The first two chunks of this R Markdown file (after the setup chunk) enable plot zooming, which also means the HTML file must be opened in a browser to view the document properly. When the file knits in RStudio the preview will appear empty, but the HTML file will show all of the content when opened in a browser, and you can click on any plot to zoom in on it.

Before you begin

Notes

A few notes about this script.

If you are running this with the 2022-2023 data, make sure you download the whole [OSM_2022-2023 GitHub repository](https://github.com/ACMElabUvic/OSM_2022-2023) from the ACMElabUvic GitHub. This will ensure you have all the files, data, and proper folder structure you will need to run this code and associated analyses.

Also make sure you open RStudio through the R project (OSM_2022-2023.Rproj). This will automatically set your working directory to the correct place (wherever you saved the repository) and ensure you don’t have to change the file paths for some of the data.

Lastly, if you are looking to adapt this code for a future year of data, make sure you have first run 1_ACME_camera_script_9-2-2024.R or .Rmd with your data, as a lot of data formatting, cleaning, and restructuring has to be done before this code will work. Helpful note: the files are numbered in the order they are used for this analysis.

If you have any questions, please email the most recent author, currently:

Marissa A. Dyck
Postdoctoral research fellow
University of Victoria
School of Environmental Studies
Email: marissadyck17@gmail.com

(update/add authors as needed)

Install packages

If you don’t already have the following packages installed, use the code below to install them.

install.packages('tidyverse')
install.packages('PerformanceAnalytics')
install.packages('Hmisc')
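
If you would rather not reinstall packages that are already on your machine, a minimal sketch like the one below installs only what is missing. The helper name find_missing_packages() is hypothetical (it is not part of this repository), and the interactive() guard is an assumption added so that nothing gets installed while the document knits.

```r
# packages this script needs (the same three as above)
required_packages <- c('tidyverse', 'PerformanceAnalytics', 'Hmisc')

# hypothetical helper: return the packages in `pkgs` that are not yet installed
find_missing_packages <- function(pkgs,
                                  installed = rownames(installed.packages())) {
  setdiff(pkgs, installed)
}

missing_packages <- find_missing_packages(required_packages)

# only install when running interactively, so knitting never triggers an install
if (interactive() && length(missing_packages) > 0) {
  install.packages(missing_packages)
}
```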

Load libraries

Then load the packages to your library.

library(tidyverse) # data tidying, visualization, and much more; this will load all tidyverse packages, can see complete list using tidyverse_packages()
library(PerformanceAnalytics) # used to generate a correlation plot
library(Hmisc) # used to generate histograms for all variables in data frame

Data

Import data

To do any analysis with the detection data from the OSM arrays, we will want to pair it with the covariate data which has human factors indices (HFI) and landcover data (VEG) for each site. There are a lot of covariates/features in these datasets that need to be grouped together to be usable, which is what this script covers.

Let’s read in the covariate data for all 6 LUs (outputs from the 2021-2022 and 2022-2023 1_ACME_camera_script_9-2-2024.Rmd). We’ve copied the 2021-2022 data from the OSM_2021-2022 repository and saved it to the processed folder so we can read in both data files with the same file path.

# model covariates (merged HFI and VEG data from the 1_ACME_camera_script_9-2-2024.R or .Rmd)
covariates <-  file.path('data/processed',
                         
                         c('OSM_covariates_2022.csv',
                           'OSM_covariates_2021.csv')) %>% 
  
  map(~.x %>%
        read_csv(.,
                 
                 # set the column types to read in correctly
                 col_types = cols(array = col_factor(),
                                  camera = col_factor(),
                                  site = col_factor(),
                                  buff_dist = col_factor(),
                                  .default = col_number()))) %>% 
  
  # give names to each data frame in list
  purrr::set_names('covs_2022',
                   'covs_2021') # R doesn't like when they are just numbers, you can make it work but it's annoying to call the data frame later so I've called them covs_year
## Warning: One or more parsing issues, call `problems()` on your data frame for details,
## e.g.:
##   dat <- vroom(...)
##   problems(dat)
# check variable structure
str(covariates)
## List of 2
##  $ covs_2022: spc_tbl_ [3,100 × 119] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
##   ..$ array                       : Factor w/ 4 levels "LU13","LU15",..: 1 1 1 1 1 1 1 1 1 1 ...
##   ..$ camera                      : Factor w/ 96 levels "18","15","03",..: 1 2 3 4 5 6 7 8 9 10 ...
##   ..$ site                        : Factor w/ 155 levels "LU13_18","LU13_15",..: 1 2 3 4 5 6 7 8 9 10 ...
##   ..$ buff_dist                   : Factor w/ 20 levels "250","500","750",..: 1 1 1 1 1 1 1 1 1 1 ...
##   ..$ vegetated_edge_roads        : num [1:3100] 0 0.0858 0 0 0 ...
##   ..$ harvest_area                : num [1:3100] 0 0 0.687 0.337 0 ...
##   ..$ road_gravel_1l              : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ conventional_seismic        : num [1:3100] 0 0.03277 0 0.00889 0.01144 ...
##   ..$ tame_pasture                : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ pipeline                    : num [1:3100] 0 0.068 0 0 0.0301 ...
##   ..$ road_gravel_2l              : num [1:3100] 0 0 0 0 0 ...
##   ..$ trail                       : num [1:3100] 0.00588 0.0028 0 0.00196 0 ...
##   ..$ well_bitumen                : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ rough_pasture               : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ well_aband                  : num [1:3100] 0 0 0 0 0.0322 ...
##   ..$ road_unclassified           : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ crop                        : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ low_impact_seismic          : num [1:3100] 0 0 0 0 0.0523 ...
##   ..$ clearing_unknown            : num [1:3100] 0.0923 0.0697 0 0 0 ...
##   ..$ cultivation_abandoned       : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ road_paved_undiv_2l         : num [1:3100] 0 0.0174 0 0 0 ...
##   ..$ road_unimproved             : num [1:3100] 0 0 0 0 0 ...
##   ..$ truck_trail                 : num [1:3100] 0 0 0 0.0139 0 ...
##   ..$ dugout                      : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ road_paved_undiv_1l         : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ well_gas                    : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ vegetated_edge_railways     : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ harvest_area_white_zone     : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ country_residence           : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ borrowpit_dry               : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ rural_residence             : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ borrowpit_wet               : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ borrowpits                  : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ grvl_sand_pit               : num [1:3100] 0 0.0873 0 0 0 ...
##   ..$ ris_reclaimed_temp          : num [1:3100] 0 0.0477 0 0 0 ...
##   ..$ ris_clearing_unknown        : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ ris_drainage                : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ ris_mines_oilsands          : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ ris_overburden_dump         : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ ris_facility_operations     : num [1:3100] 0 0 0 0 0 ...
##   ..$ transmission_line           : num [1:3100] 0.0642 0 0 0 0.091 ...
##   ..$ ris_tailing_pond            : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ clearing_wellpad_unconfirmed: num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ mines_oilsands              : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ ris_soil_replaced           : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ road_paved_1l               : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ ris_oilsands_rms            : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ ris_facility_unknown        : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ ris_borrowpits              : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ ris_transmission_line       : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ ris_soil_salvaged           : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ ris_road                    : num [1:3100] 0 0 0 0 0 ...
##   ..$ ris_plant                   : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ urban_residence             : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ facility_other              : num [1:3100] 0 0 0 0 0 ...
##   ..$ airp_runway                 : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ runway                      : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ ris_reclaimed_permanent     : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ urban_industrial            : num [1:3100] 0.291 0 0 0 0 ...
##   ..$ lagoon                      : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ facility_unknown            : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ residence_clearing          : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ well_cased                  : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ road_unpaved_2l             : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ road_paved_3l               : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ surrounding_veg             : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ rlwy_sgl_track              : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ road_winter                 : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ sump                        : num [1:3100] 0 0 0 0 0 ...
##   ..$ greenspace                  : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ road_paved_2l               : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ well_other                  : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ canal                       : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ reservoir                   : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ well_cleared_not_confirmed  : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ misc_oil_gas_facility       : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ camp_industrial             : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ ris_camp_industrial         : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ oil_gas_plant               : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ well_unknown                : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ ris_utilities               : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ cfo                         : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ recreation                  : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ campground                  : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ peat                        : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ golfcourse                  : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ landfill                    : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ transfer_station            : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ mill                        : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ road_paved_div              : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ rlwy_spur                   : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ well_cleared_not_drilled    : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ open_pit_mine               : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ well_oil                    : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ road_paved_4l               : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ mines_pitlake               : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ ris_reclaimed_certified     : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ ris_windrow                 : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ tailing_pond                : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##   .. [list output truncated]
##   ..- attr(*, "spec")=
##   .. .. cols(
##   .. ..   .default = col_number(),
##   .. ..   array = col_factor(levels = NULL, ordered = FALSE, include_na = FALSE),
##   .. ..   camera = col_factor(levels = NULL, ordered = FALSE, include_na = FALSE),
##   .. ..   site = col_factor(levels = NULL, ordered = FALSE, include_na = FALSE),
##   .. ..   buff_dist = col_factor(levels = NULL, ordered = FALSE, include_na = FALSE),
##   .. ..   vegetated_edge_roads = col_number(),
##   .. ..   harvest_area = col_number(),
##   .. ..   road_gravel_1l = col_number(),
##   .. ..   conventional_seismic = col_number(),
##   .. ..   tame_pasture = col_number(),
##   .. ..   pipeline = col_number(),
##   .. ..   road_gravel_2l = col_number(),
##   .. ..   trail = col_number(),
##   .. ..   well_bitumen = col_number(),
##   .. ..   rough_pasture = col_number(),
##   .. ..   well_aband = col_number(),
##   .. ..   road_unclassified = col_number(),
##   .. ..   crop = col_number(),
##   .. ..   low_impact_seismic = col_number(),
##   .. ..   clearing_unknown = col_number(),
##   .. ..   cultivation_abandoned = col_number(),
##   .. ..   road_paved_undiv_2l = col_number(),
##   .. ..   road_unimproved = col_number(),
##   .. ..   truck_trail = col_number(),
##   .. ..   dugout = col_number(),
##   .. ..   road_paved_undiv_1l = col_number(),
##   .. ..   well_gas = col_number(),
##   .. ..   vegetated_edge_railways = col_number(),
##   .. ..   harvest_area_white_zone = col_number(),
##   .. ..   country_residence = col_number(),
##   .. ..   borrowpit_dry = col_number(),
##   .. ..   rural_residence = col_number(),
##   .. ..   borrowpit_wet = col_number(),
##   .. ..   borrowpits = col_number(),
##   .. ..   grvl_sand_pit = col_number(),
##   .. ..   ris_reclaimed_temp = col_number(),
##   .. ..   ris_clearing_unknown = col_number(),
##   .. ..   ris_drainage = col_number(),
##   .. ..   ris_mines_oilsands = col_number(),
##   .. ..   ris_overburden_dump = col_number(),
##   .. ..   ris_facility_operations = col_number(),
##   .. ..   transmission_line = col_number(),
##   .. ..   ris_tailing_pond = col_number(),
##   .. ..   clearing_wellpad_unconfirmed = col_number(),
##   .. ..   mines_oilsands = col_number(),
##   .. ..   ris_soil_replaced = col_number(),
##   .. ..   road_paved_1l = col_number(),
##   .. ..   ris_oilsands_rms = col_number(),
##   .. ..   ris_facility_unknown = col_number(),
##   .. ..   ris_borrowpits = col_number(),
##   .. ..   ris_transmission_line = col_number(),
##   .. ..   ris_soil_salvaged = col_number(),
##   .. ..   ris_road = col_number(),
##   .. ..   ris_plant = col_number(),
##   .. ..   urban_residence = col_number(),
##   .. ..   facility_other = col_number(),
##   .. ..   airp_runway = col_number(),
##   .. ..   runway = col_number(),
##   .. ..   ris_reclaimed_permanent = col_number(),
##   .. ..   urban_industrial = col_number(),
##   .. ..   lagoon = col_number(),
##   .. ..   facility_unknown = col_number(),
##   .. ..   residence_clearing = col_number(),
##   .. ..   well_cased = col_number(),
##   .. ..   road_unpaved_2l = col_number(),
##   .. ..   road_paved_3l = col_number(),
##   .. ..   surrounding_veg = col_number(),
##   .. ..   rlwy_sgl_track = col_number(),
##   .. ..   road_winter = col_number(),
##   .. ..   sump = col_number(),
##   .. ..   greenspace = col_number(),
##   .. ..   road_paved_2l = col_number(),
##   .. ..   well_other = col_number(),
##   .. ..   canal = col_number(),
##   .. ..   reservoir = col_number(),
##   .. ..   well_cleared_not_confirmed = col_number(),
##   .. ..   misc_oil_gas_facility = col_number(),
##   .. ..   camp_industrial = col_number(),
##   .. ..   ris_camp_industrial = col_number(),
##   .. ..   oil_gas_plant = col_number(),
##   .. ..   well_unknown = col_number(),
##   .. ..   ris_utilities = col_number(),
##   .. ..   cfo = col_number(),
##   .. ..   recreation = col_number(),
##   .. ..   campground = col_number(),
##   .. ..   peat = col_number(),
##   .. ..   golfcourse = col_number(),
##   .. ..   landfill = col_number(),
##   .. ..   transfer_station = col_number(),
##   .. ..   mill = col_number(),
##   .. ..   road_paved_div = col_number(),
##   .. ..   rlwy_spur = col_number(),
##   .. ..   well_cleared_not_drilled = col_number(),
##   .. ..   open_pit_mine = col_number(),
##   .. ..   well_oil = col_number(),
##   .. ..   road_paved_4l = col_number(),
##   .. ..   mines_pitlake = col_number(),
##   .. ..   ris_reclaimed_certified = col_number(),
##   .. ..   ris_windrow = col_number(),
##   .. ..   tailing_pond = col_number(),
##   .. ..   rlwy_mlt_track = col_number(),
##   .. ..   rlwy_dbl_track = col_number(),
##   .. ..   ris_waste = col_number(),
##   .. ..   interchange_ramp = col_number(),
##   .. ..   road_paved_5l = col_number(),
##   .. ..   ris_airp_runway = col_number(),
##   .. ..   fruit_vegetables = col_number(),
##   .. ..   road_unpaved_1l = col_number(),
##   .. ..   ris_reclaim_ready = col_number(),
##   .. ..   ris_tank_farm = col_number(),
##   .. ..   lc_class20 = col_number(),
##   .. ..   lc_class32 = col_number(),
##   .. ..   lc_class33 = col_number(),
##   .. ..   lc_class34 = col_number(),
##   .. ..   lc_class50 = col_number(),
##   .. ..   lc_class110 = col_number(),
##   .. ..   lc_class120 = col_number(),
##   .. ..   lc_class210 = col_number(),
##   .. ..   lc_class220 = col_number(),
##   .. ..   lc_class230 = col_number()
##   .. .. )
##   ..- attr(*, "problems")=<externalptr> 
##  $ covs_2021: spc_tbl_ [1,560 × 80] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
##   ..$ array                       : Factor w/ 2 levels "LU2","LU3": 1 1 1 1 1 1 1 1 1 1 ...
##   ..$ camera                      : Factor w/ 58 levels "03","05","100",..: 1 2 3 4 5 6 7 8 9 10 ...
##   ..$ site                        : Factor w/ 78 levels "LU2_03","LU2_05",..: 1 2 3 4 5 6 7 8 9 10 ...
##   ..$ buff_dist                   : Factor w/ 20 levels "250","500","750",..: 1 1 1 1 1 1 1 1 1 1 ...
##   ..$ pipeline                    : num [1:1560] 0 0 0.0483 0 0.0218 ...
##   ..$ harvest_area                : num [1:1560] 0 0 0.0267 0 0 ...
##   ..$ misc_oil_gas_facility       : num [1:1560] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ transmission_line           : num [1:1560] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ conventional_seismic        : num [1:1560] 0.04091 0.00833 0.00259 0 0.00439 ...
##   ..$ low_impact_seismic          : num [1:1560] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ well_aband                  : num [1:1560] 0.0203 0 0 0 0 ...
##   ..$ well_gas                    : num [1:1560] 0 0 0 0 0.0391 ...
##   ..$ well_other                  : num [1:1560] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ well_bitumen                : num [1:1560] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ clearing_unknown            : num [1:1560] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ open_pit_mine               : num [1:1560] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ vegetated_edge_roads        : num [1:1560] 0.000958 0.022859 0.072033 0.021681 0.029158 ...
##   ..$ road_paved_undiv_2l         : num [1:1560] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ road_gravel_1l              : num [1:1560] 0 0.0227 0.0215 0.0216 0.0125 ...
##   ..$ road_unimproved             : num [1:1560] 0 0 0 0 0.00742 ...
##   ..$ harvest_area_white_zone     : num [1:1560] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ trail                       : num [1:1560] 0 0.012877 0 0.000893 0 ...
##   ..$ crop                        : num [1:1560] 0 0 0 0 0.000715 ...
##   ..$ rough_pasture               : num [1:1560] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ tame_pasture                : num [1:1560] 0 0 0 0 0.0153 ...
##   ..$ rural_residence             : num [1:1560] 0 0 0 0 0.00346 ...
##   ..$ urban_residence             : num [1:1560] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ greenspace                  : num [1:1560] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ recreation                  : num [1:1560] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ runway                      : num [1:1560] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ well_cased                  : num [1:1560] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ facility_unknown            : num [1:1560] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ urban_industrial            : num [1:1560] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ clearing_wellpad_unconfirmed: num [1:1560] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ grvl_sand_pit               : num [1:1560] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ vegetated_edge_railways     : num [1:1560] 0 0 0 0 0.127 ...
##   ..$ road_unclassified           : num [1:1560] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ borrowpit_wet               : num [1:1560] 0 0 0 0 0 ...
##   ..$ borrowpit_dry               : num [1:1560] 0 0 0 0 0 ...
##   ..$ borrowpits                  : num [1:1560] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ residence_clearing          : num [1:1560] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ campground                  : num [1:1560] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ well_cleared_not_confirmed  : num [1:1560] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ camp_industrial             : num [1:1560] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ oil_gas_plant               : num [1:1560] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ truck_trail                 : num [1:1560] 0.000815 0 0 0 0 ...
##   ..$ road_gravel_2l              : num [1:1560] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ road_paved_undiv_1l         : num [1:1560] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ sump                        : num [1:1560] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ dugout                      : num [1:1560] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ country_residence           : num [1:1560] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ mill                        : num [1:1560] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ road_paved_2l               : num [1:1560] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ facility_other              : num [1:1560] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ surrounding_veg             : num [1:1560] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ rlwy_sgl_track              : num [1:1560] 0 0 0 0 0.0244 ...
##   ..$ well_cleared_not_drilled    : num [1:1560] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ well_unknown                : num [1:1560] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ cultivation_abandoned       : num [1:1560] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ golfcourse                  : num [1:1560] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ airp_runway                 : num [1:1560] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ lagoon                      : num [1:1560] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ reservoir                   : num [1:1560] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ transfer_station            : num [1:1560] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ landfill                    : num [1:1560] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ mines_pitlake               : num [1:1560] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ rlwy_spur                   : num [1:1560] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ road_paved_1l               : num [1:1560] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ canal                       : num [1:1560] 0 0 0 0 0.0196 ...
##   ..$ gridcll                     : num [1:1560] 2 2 2 2 2 2 2 2 2 2 ...
##   ..$ lab                         : num [1:1560] NA NA NA NA NA NA NA NA NA NA ...
##   ..$ lc_class20                  : num [1:1560] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ lc_class33                  : num [1:1560] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ lc_class34                  : num [1:1560] 0 0.19 0.212 0.163 0.366 ...
##   ..$ lc_class50                  : num [1:1560] 0 0.171 0 0 0 ...
##   ..$ lc_class110                 : num [1:1560] 0 0 0.214 0 0.406 ...
##   ..$ lc_class120                 : num [1:1560] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ lc_class210                 : num [1:1560] 0.1584 0.0821 0.0307 0.2237 0.1831 ...
##   ..$ lc_class220                 : num [1:1560] 0.838 0.134 0 0.613 0 ...
##   ..$ lc_class230                 : num [1:1560] 0.00411 0.42255 0.54321 0 0.0455 ...
##   ..- attr(*, "spec")=
##   .. .. cols(
##   .. ..   .default = col_number(),
##   .. ..   array = col_factor(levels = NULL, ordered = FALSE, include_na = FALSE),
##   .. ..   camera = col_factor(levels = NULL, ordered = FALSE, include_na = FALSE),
##   .. ..   site = col_factor(levels = NULL, ordered = FALSE, include_na = FALSE),
##   .. ..   buff_dist = col_factor(levels = NULL, ordered = FALSE, include_na = FALSE),
##   .. ..   pipeline = col_number(),
##   .. ..   harvest_area = col_number(),
##   .. ..   misc_oil_gas_facility = col_number(),
##   .. ..   transmission_line = col_number(),
##   .. ..   conventional_seismic = col_number(),
##   .. ..   low_impact_seismic = col_number(),
##   .. ..   well_aband = col_number(),
##   .. ..   well_gas = col_number(),
##   .. ..   well_other = col_number(),
##   .. ..   well_bitumen = col_number(),
##   .. ..   clearing_unknown = col_number(),
##   .. ..   open_pit_mine = col_number(),
##   .. ..   vegetated_edge_roads = col_number(),
##   .. ..   road_paved_undiv_2l = col_number(),
##   .. ..   road_gravel_1l = col_number(),
##   .. ..   road_unimproved = col_number(),
##   .. ..   harvest_area_white_zone = col_number(),
##   .. ..   trail = col_number(),
##   .. ..   crop = col_number(),
##   .. ..   rough_pasture = col_number(),
##   .. ..   tame_pasture = col_number(),
##   .. ..   rural_residence = col_number(),
##   .. ..   urban_residence = col_number(),
##   .. ..   greenspace = col_number(),
##   .. ..   recreation = col_number(),
##   .. ..   runway = col_number(),
##   .. ..   well_cased = col_number(),
##   .. ..   facility_unknown = col_number(),
##   .. ..   urban_industrial = col_number(),
##   .. ..   clearing_wellpad_unconfirmed = col_number(),
##   .. ..   grvl_sand_pit = col_number(),
##   .. ..   vegetated_edge_railways = col_number(),
##   .. ..   road_unclassified = col_number(),
##   .. ..   borrowpit_wet = col_number(),
##   .. ..   borrowpit_dry = col_number(),
##   .. ..   borrowpits = col_number(),
##   .. ..   residence_clearing = col_number(),
##   .. ..   campground = col_number(),
##   .. ..   well_cleared_not_confirmed = col_number(),
##   .. ..   camp_industrial = col_number(),
##   .. ..   oil_gas_plant = col_number(),
##   .. ..   truck_trail = col_number(),
##   .. ..   road_gravel_2l = col_number(),
##   .. ..   road_paved_undiv_1l = col_number(),
##   .. ..   sump = col_number(),
##   .. ..   dugout = col_number(),
##   .. ..   country_residence = col_number(),
##   .. ..   mill = col_number(),
##   .. ..   road_paved_2l = col_number(),
##   .. ..   facility_other = col_number(),
##   .. ..   surrounding_veg = col_number(),
##   .. ..   rlwy_sgl_track = col_number(),
##   .. ..   well_cleared_not_drilled = col_number(),
##   .. ..   well_unknown = col_number(),
##   .. ..   cultivation_abandoned = col_number(),
##   .. ..   golfcourse = col_number(),
##   .. ..   airp_runway = col_number(),
##   .. ..   lagoon = col_number(),
##   .. ..   reservoir = col_number(),
##   .. ..   transfer_station = col_number(),
##   .. ..   landfill = col_number(),
##   .. ..   mines_pitlake = col_number(),
##   .. ..   rlwy_spur = col_number(),
##   .. ..   road_paved_1l = col_number(),
##   .. ..   canal = col_number(),
##   .. ..   gridcll = col_number(),
##   .. ..   lab = col_number(),
##   .. ..   lc_class20 = col_number(),
##   .. ..   lc_class33 = col_number(),
##   .. ..   lc_class34 = col_number(),
##   .. ..   lc_class50 = col_number(),
##   .. ..   lc_class110 = col_number(),
##   .. ..   lc_class120 = col_number(),
##   .. ..   lc_class210 = col_number(),
##   .. ..   lc_class220 = col_number(),
##   .. ..   lc_class230 = col_number()
##   .. .. )
##   ..- attr(*, "problems")=<externalptr>

You may get a warning about parsing issues; don’t panic, this is fine.
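
If you want to see what triggered the warning rather than take it on faith, the readr function problems() returns a table with one row per parsing issue. The sketch below is a toy illustration on a temporary file with made-up values, not the OSM data; the file contents and the object names toy_file, toy_parse, and parse_issues are assumptions for the example.

```r
library(readr)

# write a tiny made-up csv where one value in a numeric column is not a number
toy_file <- tempfile(fileext = '.csv')
writeLines(c('site,pipeline',
             'LU1_01,0.05',
             'LU1_02,not_a_number'),
           toy_file)

# read it in the same way as the covariate files
toy_parse <- read_csv(toy_file,
                      col_types = cols(site = col_factor(),
                                       .default = col_number()))

# table of parsing issues (row, column, expected, and actual value)
parse_issues <- problems(toy_parse)
parse_issues
```

For the real data, something like map(covariates, problems) should return the same kind of table for each year, assuming the covariates list from the import chunk is in your environment.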

Join data

We want a single covariate data frame, not two list elements with separate data frames as we have now, so we need to join the two data frames. We’ve done our best to ensure these are formatted similarly, but unfortunately they still don’t have exactly the same columns, so the base R function rbind() will not combine them.

This is likely to be the case each year, but we can use the dplyr function bind_rows(), which matches columns by name and fills any column that is missing from one of the data frames with NAs.

covariates_merged <- dplyr::bind_rows(covariates$covs_2022,
                                      covariates$covs_2021)

head(covariates_merged)
## # A tibble: 6 × 121
##   array camera site   buff_dist vegetated_edge_roads harvest_area road_gravel_1l
##   <fct> <fct>  <fct>  <fct>                    <dbl>        <dbl>          <dbl>
## 1 LU13  18     LU13_… 250                     0             0                  0
## 2 LU13  15     LU13_… 250                     0.0858        0                  0
## 3 LU13  03     LU13_… 250                     0             0.687              0
## 4 LU13  34     LU13_… 250                     0             0.337              0
## 5 LU13  57     LU13_… 250                     0             0                  0
## 6 LU13  16     LU13_… 250                     0             0                  0
## # ℹ 114 more variables: conventional_seismic <dbl>, tame_pasture <dbl>,
## #   pipeline <dbl>, road_gravel_2l <dbl>, trail <dbl>, well_bitumen <dbl>,
## #   rough_pasture <dbl>, well_aband <dbl>, road_unclassified <dbl>, crop <dbl>,
## #   low_impact_seismic <dbl>, clearing_unknown <dbl>,
## #   cultivation_abandoned <dbl>, road_paved_undiv_2l <dbl>,
## #   road_unimproved <dbl>, truck_trail <dbl>, dugout <dbl>,
## #   road_paved_undiv_1l <dbl>, well_gas <dbl>, vegetated_edge_railways <dbl>, …
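
To make the behaviour of bind_rows() concrete, here is a small sketch using two made-up data frames (toy_2022 and toy_2021 are hypothetical stand-ins, not the real covariate files). It first lists the columns that are unique to each data frame and then binds them so you can see where the NAs end up.

```r
library(dplyr)

# two made-up data frames standing in for the yearly covariate files
toy_2022 <- tibble(site = c('LU13_01', 'LU13_02'),
                   pipeline = c(0.10, 0.00),
                   lc_class32 = c(0.20, 0.30)) # column only in this year

toy_2021 <- tibble(site = c('LU2_01', 'LU2_02'),
                   pipeline = c(0.05, 0.02),
                   gridcll = c(2, 2)) # column only in this year

# which columns are unique to each data frame?
setdiff(names(toy_2022), names(toy_2021)) # "lc_class32"
setdiff(names(toy_2021), names(toy_2022)) # "gridcll"

# bind_rows() matches columns by name and fills the gaps with NA
toy_merged <- bind_rows(toy_2022, toy_2021)
toy_merged
```

The same setdiff() call on names(covariates$covs_2021) and names(covariates$covs_2022) should list the columns that exist in only one year of the real data, which is a quick way to see which columns will contain NAs after the bind.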

Let’s check over this data to make sure the bind worked how we expected it to.

Structure

While we specified how the columns should read in when we imported the data, this could change during the merge or from year to year, so let’s double check the data structure now that all 6 LUs are in one data frame.

We can also check that all the LUs and all the sites are indeed in the data. We should have 6 LUs and 233 sites (155 from 2022-2023 and 78 from 2021-2022).
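
Besides reading the counts off the structure output, they can be checked in code so that a mismatch stops the script. The sketch below is one way to do this; the helper check_site_counts() is hypothetical (not part of this repository) and is demonstrated on a made-up data frame.

```r
# hypothetical helper: stop with an error if the number of arrays or sites
# in `data` is not what we expect
check_site_counts <- function(data, n_arrays, n_sites) {
  stopifnot(length(unique(data$array)) == n_arrays,
            length(unique(data$site)) == n_sites)
  invisible(TRUE)
}

# made-up example with 2 arrays and 3 sites, each at 2 buffer distances
toy_covs <- data.frame(array = factor(rep(c('LU1', 'LU1', 'LU2'), times = 2)),
                       site = factor(rep(c('LU1_01', 'LU1_02', 'LU2_01'), times = 2)),
                       buff_dist = factor(rep(c(250, 500), each = 3)))

check_site_counts(toy_covs, n_arrays = 2, n_sites = 3)

# with the merged data from this script the call would be
# check_site_counts(covariates_merged, n_arrays = 6, n_sites = 233)
```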

str(covariates_merged)
## spc_tbl_ [4,660 × 121] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
##  $ array                       : Factor w/ 6 levels "LU13","LU15",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ camera                      : Factor w/ 111 levels "18","15","03",..: 1 2 3 4 5 6 7 8 9 10 ...
##  $ site                        : Factor w/ 233 levels "LU13_18","LU13_15",..: 1 2 3 4 5 6 7 8 9 10 ...
##  $ buff_dist                   : Factor w/ 20 levels "250","500","750",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ vegetated_edge_roads        : num [1:4660] 0 0.0858 0 0 0 ...
##  $ harvest_area                : num [1:4660] 0 0 0.687 0.337 0 ...
##  $ road_gravel_1l              : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ conventional_seismic        : num [1:4660] 0 0.03277 0 0.00889 0.01144 ...
##  $ tame_pasture                : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ pipeline                    : num [1:4660] 0 0.068 0 0 0.0301 ...
##  $ road_gravel_2l              : num [1:4660] 0 0 0 0 0 ...
##  $ trail                       : num [1:4660] 0.00588 0.0028 0 0.00196 0 ...
##  $ well_bitumen                : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ rough_pasture               : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ well_aband                  : num [1:4660] 0 0 0 0 0.0322 ...
##  $ road_unclassified           : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ crop                        : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ low_impact_seismic          : num [1:4660] 0 0 0 0 0.0523 ...
##  $ clearing_unknown            : num [1:4660] 0.0923 0.0697 0 0 0 ...
##  $ cultivation_abandoned       : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ road_paved_undiv_2l         : num [1:4660] 0 0.0174 0 0 0 ...
##  $ road_unimproved             : num [1:4660] 0 0 0 0 0 ...
##  $ truck_trail                 : num [1:4660] 0 0 0 0.0139 0 ...
##  $ dugout                      : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ road_paved_undiv_1l         : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ well_gas                    : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ vegetated_edge_railways     : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ harvest_area_white_zone     : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ country_residence           : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ borrowpit_dry               : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ rural_residence             : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ borrowpit_wet               : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ borrowpits                  : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ grvl_sand_pit               : num [1:4660] 0 0.0873 0 0 0 ...
##  $ ris_reclaimed_temp          : num [1:4660] 0 0.0477 0 0 0 ...
##  $ ris_clearing_unknown        : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ ris_drainage                : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ ris_mines_oilsands          : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ ris_overburden_dump         : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ ris_facility_operations     : num [1:4660] 0 0 0 0 0 ...
##  $ transmission_line           : num [1:4660] 0.0642 0 0 0 0.091 ...
##  $ ris_tailing_pond            : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ clearing_wellpad_unconfirmed: num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ mines_oilsands              : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ ris_soil_replaced           : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ road_paved_1l               : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ ris_oilsands_rms            : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ ris_facility_unknown        : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ ris_borrowpits              : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ ris_transmission_line       : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ ris_soil_salvaged           : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ ris_road                    : num [1:4660] 0 0 0 0 0 ...
##  $ ris_plant                   : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ urban_residence             : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ facility_other              : num [1:4660] 0 0 0 0 0 ...
##  $ airp_runway                 : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ runway                      : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ ris_reclaimed_permanent     : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ urban_industrial            : num [1:4660] 0.291 0 0 0 0 ...
##  $ lagoon                      : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ facility_unknown            : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ residence_clearing          : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ well_cased                  : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ road_unpaved_2l             : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ road_paved_3l               : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ surrounding_veg             : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ rlwy_sgl_track              : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ road_winter                 : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ sump                        : num [1:4660] 0 0 0 0 0 ...
##  $ greenspace                  : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ road_paved_2l               : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ well_other                  : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ canal                       : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ reservoir                   : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ well_cleared_not_confirmed  : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ misc_oil_gas_facility       : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ camp_industrial             : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ ris_camp_industrial         : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ oil_gas_plant               : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ well_unknown                : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ ris_utilities               : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ cfo                         : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ recreation                  : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ campground                  : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ peat                        : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ golfcourse                  : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ landfill                    : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ transfer_station            : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ mill                        : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ road_paved_div              : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ rlwy_spur                   : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ well_cleared_not_drilled    : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ open_pit_mine               : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ well_oil                    : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ road_paved_4l               : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ mines_pitlake               : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ ris_reclaimed_certified     : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ ris_windrow                 : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ tailing_pond                : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##   [list output truncated]
##  - attr(*, "spec")=
##   .. cols(
##   ..   .default = col_number(),
##   ..   array = col_factor(levels = NULL, ordered = FALSE, include_na = FALSE),
##   ..   camera = col_factor(levels = NULL, ordered = FALSE, include_na = FALSE),
##   ..   site = col_factor(levels = NULL, ordered = FALSE, include_na = FALSE),
##   ..   buff_dist = col_factor(levels = NULL, ordered = FALSE, include_na = FALSE),
##   ..   vegetated_edge_roads = col_number(),
##   ..   harvest_area = col_number(),
##   ..   road_gravel_1l = col_number(),
##   ..   conventional_seismic = col_number(),
##   ..   tame_pasture = col_number(),
##   ..   pipeline = col_number(),
##   ..   road_gravel_2l = col_number(),
##   ..   trail = col_number(),
##   ..   well_bitumen = col_number(),
##   ..   rough_pasture = col_number(),
##   ..   well_aband = col_number(),
##   ..   road_unclassified = col_number(),
##   ..   crop = col_number(),
##   ..   low_impact_seismic = col_number(),
##   ..   clearing_unknown = col_number(),
##   ..   cultivation_abandoned = col_number(),
##   ..   road_paved_undiv_2l = col_number(),
##   ..   road_unimproved = col_number(),
##   ..   truck_trail = col_number(),
##   ..   dugout = col_number(),
##   ..   road_paved_undiv_1l = col_number(),
##   ..   well_gas = col_number(),
##   ..   vegetated_edge_railways = col_number(),
##   ..   harvest_area_white_zone = col_number(),
##   ..   country_residence = col_number(),
##   ..   borrowpit_dry = col_number(),
##   ..   rural_residence = col_number(),
##   ..   borrowpit_wet = col_number(),
##   ..   borrowpits = col_number(),
##   ..   grvl_sand_pit = col_number(),
##   ..   ris_reclaimed_temp = col_number(),
##   ..   ris_clearing_unknown = col_number(),
##   ..   ris_drainage = col_number(),
##   ..   ris_mines_oilsands = col_number(),
##   ..   ris_overburden_dump = col_number(),
##   ..   ris_facility_operations = col_number(),
##   ..   transmission_line = col_number(),
##   ..   ris_tailing_pond = col_number(),
##   ..   clearing_wellpad_unconfirmed = col_number(),
##   ..   mines_oilsands = col_number(),
##   ..   ris_soil_replaced = col_number(),
##   ..   road_paved_1l = col_number(),
##   ..   ris_oilsands_rms = col_number(),
##   ..   ris_facility_unknown = col_number(),
##   ..   ris_borrowpits = col_number(),
##   ..   ris_transmission_line = col_number(),
##   ..   ris_soil_salvaged = col_number(),
##   ..   ris_road = col_number(),
##   ..   ris_plant = col_number(),
##   ..   urban_residence = col_number(),
##   ..   facility_other = col_number(),
##   ..   airp_runway = col_number(),
##   ..   runway = col_number(),
##   ..   ris_reclaimed_permanent = col_number(),
##   ..   urban_industrial = col_number(),
##   ..   lagoon = col_number(),
##   ..   facility_unknown = col_number(),
##   ..   residence_clearing = col_number(),
##   ..   well_cased = col_number(),
##   ..   road_unpaved_2l = col_number(),
##   ..   road_paved_3l = col_number(),
##   ..   surrounding_veg = col_number(),
##   ..   rlwy_sgl_track = col_number(),
##   ..   road_winter = col_number(),
##   ..   sump = col_number(),
##   ..   greenspace = col_number(),
##   ..   road_paved_2l = col_number(),
##   ..   well_other = col_number(),
##   ..   canal = col_number(),
##   ..   reservoir = col_number(),
##   ..   well_cleared_not_confirmed = col_number(),
##   ..   misc_oil_gas_facility = col_number(),
##   ..   camp_industrial = col_number(),
##   ..   ris_camp_industrial = col_number(),
##   ..   oil_gas_plant = col_number(),
##   ..   well_unknown = col_number(),
##   ..   ris_utilities = col_number(),
##   ..   cfo = col_number(),
##   ..   recreation = col_number(),
##   ..   campground = col_number(),
##   ..   peat = col_number(),
##   ..   golfcourse = col_number(),
##   ..   landfill = col_number(),
##   ..   transfer_station = col_number(),
##   ..   mill = col_number(),
##   ..   road_paved_div = col_number(),
##   ..   rlwy_spur = col_number(),
##   ..   well_cleared_not_drilled = col_number(),
##   ..   open_pit_mine = col_number(),
##   ..   well_oil = col_number(),
##   ..   road_paved_4l = col_number(),
##   ..   mines_pitlake = col_number(),
##   ..   ris_reclaimed_certified = col_number(),
##   ..   ris_windrow = col_number(),
##   ..   tailing_pond = col_number(),
##   ..   rlwy_mlt_track = col_number(),
##   ..   rlwy_dbl_track = col_number(),
##   ..   ris_waste = col_number(),
##   ..   interchange_ramp = col_number(),
##   ..   road_paved_5l = col_number(),
##   ..   ris_airp_runway = col_number(),
##   ..   fruit_vegetables = col_number(),
##   ..   road_unpaved_1l = col_number(),
##   ..   ris_reclaim_ready = col_number(),
##   ..   ris_tank_farm = col_number(),
##   ..   lc_class20 = col_number(),
##   ..   lc_class32 = col_number(),
##   ..   lc_class33 = col_number(),
##   ..   lc_class34 = col_number(),
##   ..   lc_class50 = col_number(),
##   ..   lc_class110 = col_number(),
##   ..   lc_class120 = col_number(),
##   ..   lc_class210 = col_number(),
##   ..   lc_class220 = col_number(),
##   ..   lc_class230 = col_number()
##   .. )
##  - attr(*, "problems")=<externalptr>

It looks like everything read in correctly. I don’t see any missing columns (we won’t need the lab or gridcll columns, which we can deselect later), and all the arrays (LUs) and sites are accounted for.
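If you would rather confirm this with code than by reading through the str() output, a minimal sketch like the one below can help. It uses a small made-up tibble (`toy`) in place of covariates_merged, and it assumes every site should have exactly one row per buffer distance, so swap in your own data and adjust that assumption as needed.

```r
library(dplyr)

# toy stand-in for covariates_merged (made-up values, not the real data)
toy <- tibble(
  array     = factor(c("LU01", "LU01", "LU02", "LU02")),
  site      = factor(c("LU01_01", "LU01_01", "LU02_01", "LU02_01")),
  buff_dist = factor(c(250, 500, 250, 500))
)

# which arrays are present in the data
levels(toy$array)

# number of rows for each site
rows_per_site <- toy %>% 
  count(site)

# assumption: each site has one row per buffer distance
all(rows_per_site$n == nlevels(toy$buff_dist))
```

If the last line returns FALSE, some sites are missing buffer distances (or have duplicates) and are worth a closer look before moving on.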

NAs

Let’s check the data summary now. We might have NAs for some of the HFI features, but we shouldn’t have any for the other variables.

summary(covariates_merged)
##   array         camera          site        buff_dist    vegetated_edge_roads
##  LU13:820   27     : 120   LU13_18:  20   250    : 233   Min.   :0.000000    
##  LU15:780   32     : 120   LU13_15:  20   500    : 233   1st Qu.:0.002604    
##  LU21:720   36     : 120   LU13_03:  20   750    : 233   Median :0.006764    
##  LU01:780   21     : 100   LU13_34:  20   1000   : 233   Mean   :0.010682    
##  LU2 :840   41     : 100   LU13_57:  20   1250   : 233   3rd Qu.:0.013869    
##  LU3 :720   18     :  80   LU13_16:  20   1500   : 233   Max.   :0.147883    
##             (Other):4020   (Other):4540   (Other):3262                       
##   harvest_area     road_gravel_1l     conventional_seismic  tame_pasture      
##  Min.   :0.00000   Min.   :0.000000   Min.   :0.000000     Min.   :0.0000000  
##  1st Qu.:0.00000   1st Qu.:0.000000   1st Qu.:0.003485     1st Qu.:0.0000000  
##  Median :0.00000   Median :0.001385   Median :0.006323     Median :0.0000000  
##  Mean   :0.04720   Mean   :0.002913   Mean   :0.006592     Mean   :0.0008195  
##  3rd Qu.:0.03969   3rd Qu.:0.003689   3rd Qu.:0.009171     3rd Qu.:0.0000000  
##  Max.   :0.83674   Max.   :0.038085   Max.   :0.045512     Max.   :0.1636895  
##                                                                               
##     pipeline       road_gravel_2l          trail            well_bitumen     
##  Min.   :0.00000   Min.   :0.0000000   Min.   :0.0000000   Min.   :0.000000  
##  1st Qu.:0.00000   1st Qu.:0.0000000   1st Qu.:0.0001209   1st Qu.:0.000000  
##  Median :0.01158   Median :0.0000000   Median :0.0007039   Median :0.000000  
##  Mean   :0.01810   Mean   :0.0011075   Mean   :0.0010490   Mean   :0.006039  
##  3rd Qu.:0.02619   3rd Qu.:0.0004745   3rd Qu.:0.0015517   3rd Qu.:0.005144  
##  Max.   :0.28896   Max.   :0.0438815   Max.   :0.0197691   Max.   :0.187398  
##                                                                              
##  rough_pasture         well_aband        road_unclassified  
##  Min.   :0.0000000   Min.   :0.0000000   Min.   :0.000e+00  
##  1st Qu.:0.0000000   1st Qu.:0.0003367   1st Qu.:0.000e+00  
##  Median :0.0000000   Median :0.0019160   Median :0.000e+00  
##  Mean   :0.0002038   Mean   :0.0058542   Mean   :4.093e-06  
##  3rd Qu.:0.0000000   3rd Qu.:0.0093228   3rd Qu.:0.000e+00  
##  Max.   :0.0828324   Max.   :0.3045402   Max.   :8.613e-04  
##                                                             
##       crop           low_impact_seismic clearing_unknown   
##  Min.   :0.000e+00   Min.   :0.000000   Min.   :0.0000000  
##  1st Qu.:0.000e+00   1st Qu.:0.000000   1st Qu.:0.0000000  
##  Median :0.000e+00   Median :0.000000   Median :0.0001542  
##  Mean   :1.469e-06   Mean   :0.005522   Mean   :0.0044589  
##  3rd Qu.:0.000e+00   3rd Qu.:0.004557   3rd Qu.:0.0026457  
##  Max.   :2.571e-03   Max.   :0.087576   Max.   :0.4023522  
##                                                            
##  cultivation_abandoned road_paved_undiv_2l road_unimproved    
##  Min.   :0.000e+00     Min.   :0.0000000   Min.   :0.0000000  
##  1st Qu.:0.000e+00     1st Qu.:0.0000000   1st Qu.:0.0000000  
##  Median :0.000e+00     Median :0.0000000   Median :0.0003318  
##  Mean   :2.547e-05     Mean   :0.0005082   Mean   :0.0016662  
##  3rd Qu.:0.000e+00     3rd Qu.:0.0000000   3rd Qu.:0.0018760  
##  Max.   :3.115e-02     Max.   :0.0431664   Max.   :0.0532898  
##                                                               
##   truck_trail           dugout          road_paved_undiv_1l    well_gas        
##  Min.   :0.000000   Min.   :0.000e+00   Min.   :0.000e+00   Min.   :0.0000000  
##  1st Qu.:0.000000   1st Qu.:0.000e+00   1st Qu.:0.000e+00   1st Qu.:0.0000000  
##  Median :0.000000   Median :0.000e+00   Median :0.000e+00   Median :0.0000000  
##  Mean   :0.000609   Mean   :3.480e-06   Mean   :7.514e-05   Mean   :0.0003188  
##  3rd Qu.:0.000398   3rd Qu.:0.000e+00   3rd Qu.:0.000e+00   3rd Qu.:0.0001151  
##  Max.   :0.038651   Max.   :1.825e-03   Max.   :2.147e-02   Max.   :0.0572117  
##                                                                                
##  vegetated_edge_railways harvest_area_white_zone country_residence  
##  Min.   :0.000e+00       Min.   :0.0000000       Min.   :0.0000000  
##  1st Qu.:0.000e+00       1st Qu.:0.0000000       1st Qu.:0.0000000  
##  Median :0.000e+00       Median :0.0000000       Median :0.0000000  
##  Mean   :8.976e-05       Mean   :0.0002387       Mean   :0.0000608  
##  3rd Qu.:0.000e+00       3rd Qu.:0.0000000       3rd Qu.:0.0000000  
##  Max.   :1.271e-01       Max.   :0.0543438       Max.   :0.0171405  
##                                                                     
##  borrowpit_dry       rural_residence     borrowpit_wet        borrowpits       
##  Min.   :0.0000000   Min.   :0.000e+00   Min.   :0.000000   Min.   :0.0000000  
##  1st Qu.:0.0000000   1st Qu.:0.000e+00   1st Qu.:0.000000   1st Qu.:0.0000000  
##  Median :0.0000000   Median :0.000e+00   Median :0.000000   Median :0.0000000  
##  Mean   :0.0009134   Mean   :5.307e-05   Mean   :0.000642   Mean   :0.0003201  
##  3rd Qu.:0.0003956   3rd Qu.:0.000e+00   3rd Qu.:0.000000   3rd Qu.:0.0000000  
##  Max.   :0.1038665   Max.   :2.805e-02   Max.   :0.271759   Max.   :0.1163709  
##                                                                                
##  grvl_sand_pit      ris_reclaimed_temp ris_clearing_unknown  ris_drainage   
##  Min.   :0.000000   Min.   :0.0000     Min.   :0.0000       Min.   :0.0000  
##  1st Qu.:0.000000   1st Qu.:0.0000     1st Qu.:0.0000       1st Qu.:0.0000  
##  Median :0.000000   Median :0.0000     Median :0.0000       Median :0.0000  
##  Mean   :0.001888   Mean   :0.0002     Mean   :0.0004       Mean   :0.0001  
##  3rd Qu.:0.000000   3rd Qu.:0.0000     3rd Qu.:0.0000       3rd Qu.:0.0000  
##  Max.   :0.557858   Max.   :0.0477     Max.   :0.0494       Max.   :0.0168  
##                     NA's   :1560       NA's   :1560         NA's   :1560    
##  ris_mines_oilsands ris_overburden_dump ris_facility_operations
##  Min.   :0.0000     Min.   :0.0000      Min.   :0.0000         
##  1st Qu.:0.0000     1st Qu.:0.0000      1st Qu.:0.0000         
##  Median :0.0000     Median :0.0000      Median :0.0000         
##  Mean   :0.0001     Mean   :0.0001      Mean   :0.0004         
##  3rd Qu.:0.0000     3rd Qu.:0.0000      3rd Qu.:0.0000         
##  Max.   :0.0567     Max.   :0.0211      Max.   :0.1274         
##  NA's   :1560       NA's   :1560        NA's   :1560           
##  transmission_line  ris_tailing_pond clearing_wellpad_unconfirmed
##  Min.   :0.000000   Min.   :0.0000   Min.   :0.0000000           
##  1st Qu.:0.000000   1st Qu.:0.0000   1st Qu.:0.0000000           
##  Median :0.000000   Median :0.0000   Median :0.0000000           
##  Mean   :0.004601   Mean   :0.0012   Mean   :0.0003592           
##  3rd Qu.:0.004977   3rd Qu.:0.0000   3rd Qu.:0.0003713           
##  Max.   :0.173950   Max.   :0.1738   Max.   :0.0723607           
##                     NA's   :1560                                 
##  mines_oilsands   ris_soil_replaced road_paved_1l ris_oilsands_rms
##  Min.   :0.0000   Min.   :0.0000    Min.   :0     Min.   :0.0000  
##  1st Qu.:0.0000   1st Qu.:0.0000    1st Qu.:0     1st Qu.:0.0000  
##  Median :0.0000   Median :0.0000    Median :0     Median :0.0000  
##  Mean   :0.0009   Mean   :0.0002    Mean   :0     Mean   :0.0002  
##  3rd Qu.:0.0000   3rd Qu.:0.0000    3rd Qu.:0     3rd Qu.:0.0000  
##  Max.   :0.1223   Max.   :0.0245    Max.   :0     Max.   :0.0335  
##  NA's   :1560     NA's   :1560                    NA's   :1560    
##  ris_facility_unknown ris_borrowpits   ris_transmission_line ris_soil_salvaged
##  Min.   :0            Min.   :0.0000   Min.   :0.0000        Min.   :0.0000   
##  1st Qu.:0            1st Qu.:0.0000   1st Qu.:0.0000        1st Qu.:0.0000   
##  Median :0            Median :0.0000   Median :0.0000        Median :0.0000   
##  Mean   :0            Mean   :0.0000   Mean   :0.0000        Mean   :0.0001   
##  3rd Qu.:0            3rd Qu.:0.0000   3rd Qu.:0.0000        3rd Qu.:0.0000   
##  Max.   :0            Max.   :0.0051   Max.   :0.0027        Max.   :0.0415   
##  NA's   :1560         NA's   :1560     NA's   :1560          NA's   :1560     
##     ris_road        ris_plant    urban_residence     facility_other     
##  Min.   :0.0000   Min.   :0      Min.   :0.000e+00   Min.   :0.0000000  
##  1st Qu.:0.0000   1st Qu.:0      1st Qu.:0.000e+00   1st Qu.:0.0000000  
##  Median :0.0000   Median :0      Median :0.000e+00   Median :0.0000000  
##  Mean   :0.0002   Mean   :0      Mean   :4.099e-05   Mean   :0.0007405  
##  3rd Qu.:0.0000   3rd Qu.:0      3rd Qu.:0.000e+00   3rd Qu.:0.0000000  
##  Max.   :0.0218   Max.   :0      Max.   :1.157e-02   Max.   :0.2009920  
##  NA's   :1560     NA's   :1560                                          
##   airp_runway     runway          ris_reclaimed_permanent urban_industrial  
##  Min.   :0    Min.   :0.000e+00   Min.   :0.0000          Min.   :0.000000  
##  1st Qu.:0    1st Qu.:0.000e+00   1st Qu.:0.0000          1st Qu.:0.000000  
##  Median :0    Median :0.000e+00   Median :0.0000          Median :0.000000  
##  Mean   :0    Mean   :3.529e-05   Mean   :0.0006          Mean   :0.001092  
##  3rd Qu.:0    3rd Qu.:0.000e+00   3rd Qu.:0.0000          3rd Qu.:0.000000  
##  Max.   :0    Max.   :1.525e-02   Max.   :0.0535          Max.   :0.335749  
##                                   NA's   :1560                              
##      lagoon          facility_unknown    residence_clearing 
##  Min.   :0.0000000   Min.   :0.0000000   Min.   :0.000e+00  
##  1st Qu.:0.0000000   1st Qu.:0.0000000   1st Qu.:0.000e+00  
##  Median :0.0000000   Median :0.0000000   Median :0.000e+00  
##  Mean   :0.0001343   Mean   :0.0001777   Mean   :7.892e-06  
##  3rd Qu.:0.0000000   3rd Qu.:0.0000000   3rd Qu.:0.000e+00  
##  Max.   :0.0218390   Max.   :0.1379450   Max.   :3.113e-03  
##                                                             
##    well_cased        road_unpaved_2l road_paved_3l  surrounding_veg    
##  Min.   :0.0000000   Min.   :0       Min.   :0      Min.   :0.000e+00  
##  1st Qu.:0.0000000   1st Qu.:0       1st Qu.:0      1st Qu.:0.000e+00  
##  Median :0.0000000   Median :0       Median :0      Median :0.000e+00  
##  Mean   :0.0005716   Mean   :0       Mean   :0      Mean   :9.553e-05  
##  3rd Qu.:0.0001940   3rd Qu.:0       3rd Qu.:0      3rd Qu.:0.000e+00  
##  Max.   :0.0685807   Max.   :0       Max.   :0      Max.   :8.209e-02  
##                      NA's   :1560    NA's   :1560                      
##  rlwy_sgl_track      road_winter        sump            greenspace       
##  Min.   :0.00e+00   Min.   :0      Min.   :0.000000   Min.   :0.000e+00  
##  1st Qu.:0.00e+00   1st Qu.:0      1st Qu.:0.000000   1st Qu.:0.000e+00  
##  Median :0.00e+00   Median :0      Median :0.000000   Median :0.000e+00  
##  Mean   :2.92e-05   Mean   :0      Mean   :0.002142   Mean   :1.466e-05  
##  3rd Qu.:0.00e+00   3rd Qu.:0      3rd Qu.:0.001785   3rd Qu.:0.000e+00  
##  Max.   :2.44e-02   Max.   :0      Max.   :0.311103   Max.   :3.028e-03  
##                     NA's   :1560                                         
##  road_paved_2l   well_other           canal             reservoir        
##  Min.   :0     Min.   :0.000000   Min.   :0.0000000   Min.   :0.000e+00  
##  1st Qu.:0     1st Qu.:0.000000   1st Qu.:0.0000000   1st Qu.:0.000e+00  
##  Median :0     Median :0.000000   Median :0.0000000   Median :0.000e+00  
##  Mean   :0     Mean   :0.001548   Mean   :0.0000167   Mean   :8.339e-06  
##  3rd Qu.:0     3rd Qu.:0.001006   3rd Qu.:0.0000000   3rd Qu.:0.000e+00  
##  Max.   :0     Max.   :0.116479   Max.   :0.0196060   Max.   :7.894e-03  
##                                                                          
##  well_cleared_not_confirmed misc_oil_gas_facility camp_industrial    
##  Min.   :0.0000000          Min.   :0.0000000     Min.   :0.0000000  
##  1st Qu.:0.0000000          1st Qu.:0.0000000     1st Qu.:0.0000000  
##  Median :0.0000000          Median :0.0000000     Median :0.0000000  
##  Mean   :0.0002716          Mean   :0.0031465     Mean   :0.0005956  
##  3rd Qu.:0.0000000          3rd Qu.:0.0006169     3rd Qu.:0.0000000  
##  Max.   :0.0829690          Max.   :0.3449713     Max.   :0.2450556  
##                                                                      
##  ris_camp_industrial oil_gas_plant       well_unknown       ris_utilities   
##  Min.   :0           Min.   :0.000000   Min.   :0.000e+00   Min.   :0.0000  
##  1st Qu.:0           1st Qu.:0.000000   1st Qu.:0.000e+00   1st Qu.:0.0000  
##  Median :0           Median :0.000000   Median :0.000e+00   Median :0.0000  
##  Mean   :0           Mean   :0.001106   Mean   :3.274e-05   Mean   :0.0000  
##  3rd Qu.:0           3rd Qu.:0.000000   3rd Qu.:0.000e+00   3rd Qu.:0.0000  
##  Max.   :0           Max.   :0.175037   Max.   :4.813e-03   Max.   :0.0025  
##  NA's   :1560                                               NA's   :1560    
##       cfo           recreation   campground            peat        golfcourse
##  Min.   :0.0000   Min.   :0    Min.   :0.000000   Min.   :0      Min.   :0   
##  1st Qu.:0.0000   1st Qu.:0    1st Qu.:0.000000   1st Qu.:0      1st Qu.:0   
##  Median :0.0000   Median :0    Median :0.000000   Median :0      Median :0   
##  Mean   :0.0000   Mean   :0    Mean   :0.000103   Mean   :0      Mean   :0   
##  3rd Qu.:0.0000   3rd Qu.:0    3rd Qu.:0.000000   3rd Qu.:0      3rd Qu.:0   
##  Max.   :0.0012   Max.   :0    Max.   :0.028966   Max.   :0      Max.   :0   
##  NA's   :1560                                     NA's   :1560               
##     landfill transfer_station      mill   road_paved_div     rlwy_spur
##  Min.   :0   Min.   :0        Min.   :0   Min.   :0.0000   Min.   :0  
##  1st Qu.:0   1st Qu.:0        1st Qu.:0   1st Qu.:0.0000   1st Qu.:0  
##  Median :0   Median :0        Median :0   Median :0.0000   Median :0  
##  Mean   :0   Mean   :0        Mean   :0   Mean   :0.0000   Mean   :0  
##  3rd Qu.:0   3rd Qu.:0        3rd Qu.:0   3rd Qu.:0.0000   3rd Qu.:0  
##  Max.   :0   Max.   :0        Max.   :0   Max.   :0.0019   Max.   :0  
##                                           NA's   :1560                
##  well_cleared_not_drilled open_pit_mine          well_oil    road_paved_4l 
##  Min.   :0.000e+00        Min.   :0.0000000   Min.   :0      Min.   :0     
##  1st Qu.:0.000e+00        1st Qu.:0.0000000   1st Qu.:0      1st Qu.:0     
##  Median :0.000e+00        Median :0.0000000   Median :0      Median :0     
##  Mean   :2.143e-05        Mean   :0.0005218   Mean   :0      Mean   :0     
##  3rd Qu.:0.000e+00        3rd Qu.:0.0000000   3rd Qu.:0      3rd Qu.:0     
##  Max.   :1.469e-02        Max.   :0.0641059   Max.   :0      Max.   :0     
##                                               NA's   :1560   NA's   :1560  
##  mines_pitlake ris_reclaimed_certified  ris_windrow     tailing_pond  
##  Min.   :0     Min.   :0               Min.   :0.000   Min.   :0.000  
##  1st Qu.:0     1st Qu.:0               1st Qu.:0.000   1st Qu.:0.000  
##  Median :0     Median :0               Median :0.000   Median :0.000  
##  Mean   :0     Mean   :0               Mean   :0.000   Mean   :0.000  
##  3rd Qu.:0     3rd Qu.:0               3rd Qu.:0.000   3rd Qu.:0.000  
##  Max.   :0     Max.   :0               Max.   :0.016   Max.   :0.004  
##                NA's   :1560            NA's   :1560    NA's   :1560   
##  rlwy_mlt_track rlwy_dbl_track   ris_waste    interchange_ramp road_paved_5l 
##  Min.   :0      Min.   :0      Min.   :0      Min.   :0        Min.   :0     
##  1st Qu.:0      1st Qu.:0      1st Qu.:0      1st Qu.:0        1st Qu.:0     
##  Median :0      Median :0      Median :0      Median :0        Median :0     
##  Mean   :0      Mean   :0      Mean   :0      Mean   :0        Mean   :0     
##  3rd Qu.:0      3rd Qu.:0      3rd Qu.:0      3rd Qu.:0        3rd Qu.:0     
##  Max.   :0      Max.   :0      Max.   :0      Max.   :0        Max.   :0     
##  NA's   :1560   NA's   :1560   NA's   :1560   NA's   :1560     NA's   :1560  
##  ris_airp_runway fruit_vegetables road_unpaved_1l ris_reclaim_ready
##  Min.   :0       Min.   :0        Min.   :0       Min.   :0        
##  1st Qu.:0       1st Qu.:0        1st Qu.:0       1st Qu.:0        
##  Median :0       Median :0        Median :0       Median :0        
##  Mean   :0       Mean   :0        Mean   :0       Mean   :0        
##  3rd Qu.:0       3rd Qu.:0        3rd Qu.:0       3rd Qu.:0        
##  Max.   :0       Max.   :0        Max.   :0       Max.   :0        
##  NA's   :1560    NA's   :1560     NA's   :1560    NA's   :1560     
##  ris_tank_farm    lc_class20         lc_class32       lc_class33      
##  Min.   :0      Min.   :0.000000   Min.   :0.0000   Min.   :0.000000  
##  1st Qu.:0      1st Qu.:0.000000   1st Qu.:0.0000   1st Qu.:0.000000  
##  Median :0      Median :0.002998   Median :0.0000   Median :0.000000  
##  Mean   :0      Mean   :0.028712   Mean   :0.0000   Mean   :0.003658  
##  3rd Qu.:0      3rd Qu.:0.032240   3rd Qu.:0.0000   3rd Qu.:0.000000  
##  Max.   :0      Max.   :0.519648   Max.   :0.0118   Max.   :0.324028  
##  NA's   :1560                      NA's   :1560                       
##    lc_class34        lc_class50       lc_class110        lc_class120       
##  Min.   :0.00000   Min.   :0.00000   Min.   :0.000000   Min.   :0.0000000  
##  1st Qu.:0.00000   1st Qu.:0.01039   1st Qu.:0.009894   1st Qu.:0.0000000  
##  Median :0.01762   Median :0.04435   Median :0.036593   Median :0.0000000  
##  Mean   :0.03201   Mean   :0.09125   Mean   :0.046553   Mean   :0.0008668  
##  3rd Qu.:0.04129   3rd Qu.:0.10789   3rd Qu.:0.061983   3rd Qu.:0.0000000  
##  Max.   :0.55710   Max.   :1.00000   Max.   :0.731887   Max.   :0.1654590  
##                                                                            
##   lc_class210      lc_class220        lc_class230         gridcll     
##  Min.   :0.0000   Min.   :0.000000   Min.   :0.00000   Min.   :2.000  
##  1st Qu.:0.2869   1st Qu.:0.008999   1st Qu.:0.02228   1st Qu.:2.000  
##  Median :0.5748   Median :0.071814   Median :0.06057   Median :2.000  
##  Mean   :0.5346   Mean   :0.140469   Mean   :0.12188   Mean   :2.462  
##  3rd Qu.:0.7873   3rd Qu.:0.230000   3rd Qu.:0.17370   3rd Qu.:3.000  
##  Max.   :1.0000   Max.   :1.000000   Max.   :0.93217   Max.   :3.000  
##                                                        NA's   :3100   
##       lab      
##  Min.   : NA   
##  1st Qu.: NA   
##  Median : NA   
##  Mean   :NaN   
##  3rd Qu.: NA   
##  Max.   : NA   
##  NA's   :4660

This looks good. We will want to replace the NAs with zeros during data formatting. The only reason we have NAs is that there weren’t any of those features in the other data file. Since these columns are proportions of each feature, a feature that isn’t present has a proportion of zero, and replacing the NAs with zeros means we don’t lose the other data for those sites.
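With this many columns the summary output is long, so here is a quick sketch of how you could list only the columns that contain NAs. The tibble `toy` and its columns (feature_a, feature_b) are made up for illustration; with the real data you would use covariates_merged instead.

```r
library(dplyr)

# toy stand-in: feature_b was absent from one source file,
# so its rows are NA after joining (made-up values)
toy <- tibble(
  site      = factor(c("A_01", "A_02", "B_01", "B_02")),
  feature_a = c(0.10, 0.00, 0.25, 0.05),
  feature_b = c(0.02, 0.01, NA, NA)
)

# count the NAs in each column
na_counts <- colSums(is.na(toy))

# show only the columns that have at least one NA
na_counts[na_counts > 0]

# dropping rows with NAs would throw away the feature_a data for those sites
nrow(na.omit(toy))
```

The last line shows why we replace rather than drop: removing incomplete rows also removes the valid values those sites have for other features.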

Data formatting

This section will need to be altered year-to-year to accommodate various issues that are unique to each year, but offers a good starting point.

I like to do as much of my data manipulation as I can in one dplyr pipe (i.e. code chunk) to avoid extra coding and assigning intermediate objects to the environment that I don’t need. If this format doesn’t make sense to you, each step can be done individually if you pull the code out of the pipeline and reference the data within each function. I do write each step individually and check that it’s working correctly as I go.
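To show what I mean, here is a small sketch with made-up data (the tibble `toy` and the intermediate object names are just for illustration). The same two steps are written once as a pipe and once as individual calls that name the data inside each function.

```r
library(dplyr)

# toy data (made up) with a column we don't need and some NAs
toy <- tibble(
  site   = c("A_01", "A_02"),
  camera = c(1, 2),
  trail  = c(0.1, NA),
  crop   = c(NA, 0.3)
)

# piped version: each step feeds into the next
toy_piped <- toy %>% 
  select(!camera) %>% 
  replace(is.na(.),
          0)

# same steps done one at a time, referencing the data in each function
toy_step1 <- select(toy, !camera)
toy_step2 <- replace(toy_step1, is.na(toy_step1), 0)

# both approaches should give the same result
identical(toy_piped, toy_step2)
```

The step-by-step version leaves toy_step1 in the environment, which is handy for checking your work but adds clutter once you know the code runs.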

In the code chunk below I:

  1. remove the camera, lab, and gridcll columns we don’t need
  2. reorder columns alphabetically, except array, site, and buff_dist, which will be at the front
  3. replace NAs with zeros

Then we run summary to check that everything worked. (If you have other formatting to do, you may need to use other functions to check that everything worked.)

covariates_fixed <- covariates_merged %>% 
  
  # remove columns we won't use anymore
  select(!c(camera, 
            gridcll,
            lab)) %>%  
  
  # order columns alphabetically
  select(order(colnames(.))) %>% 
  
  # we want to move the columns that aren't HFI features or landcover to the front
  relocate(.,
           c(array,
             site,
             buff_dist)) %>% 
  
  # replace NAs introduced from joining data to zeros
  replace(is.na(.),
          0)

# check that everything looks good  
summary(covariates_fixed)
##   array          site        buff_dist     airp_runway borrowpit_dry      
##  LU13:820   LU13_18:  20   250    : 233   Min.   :0    Min.   :0.0000000  
##  LU15:780   LU13_15:  20   500    : 233   1st Qu.:0    1st Qu.:0.0000000  
##  LU21:720   LU13_03:  20   750    : 233   Median :0    Median :0.0000000  
##  LU01:780   LU13_34:  20   1000   : 233   Mean   :0    Mean   :0.0009134  
##  LU2 :840   LU13_57:  20   1250   : 233   3rd Qu.:0    3rd Qu.:0.0003956  
##  LU3 :720   LU13_16:  20   1500   : 233   Max.   :0    Max.   :0.1038665  
##             (Other):4540   (Other):3262                                   
##  borrowpit_wet        borrowpits        camp_industrial       campground      
##  Min.   :0.000000   Min.   :0.0000000   Min.   :0.0000000   Min.   :0.000000  
##  1st Qu.:0.000000   1st Qu.:0.0000000   1st Qu.:0.0000000   1st Qu.:0.000000  
##  Median :0.000000   Median :0.0000000   Median :0.0000000   Median :0.000000  
##  Mean   :0.000642   Mean   :0.0003201   Mean   :0.0005956   Mean   :0.000103  
##  3rd Qu.:0.000000   3rd Qu.:0.0000000   3rd Qu.:0.0000000   3rd Qu.:0.000000  
##  Max.   :0.271759   Max.   :0.1163709   Max.   :0.2450556   Max.   :0.028966  
##                                                                               
##      canal                cfo            clearing_unknown   
##  Min.   :0.0000000   Min.   :0.000e+00   Min.   :0.0000000  
##  1st Qu.:0.0000000   1st Qu.:0.000e+00   1st Qu.:0.0000000  
##  Median :0.0000000   Median :0.000e+00   Median :0.0001542  
##  Mean   :0.0000167   Mean   :5.398e-07   Mean   :0.0044589  
##  3rd Qu.:0.0000000   3rd Qu.:0.000e+00   3rd Qu.:0.0026457  
##  Max.   :0.0196060   Max.   :1.217e-03   Max.   :0.4023522  
##                                                             
##  clearing_wellpad_unconfirmed conventional_seismic country_residence  
##  Min.   :0.0000000            Min.   :0.000000     Min.   :0.0000000  
##  1st Qu.:0.0000000            1st Qu.:0.003485     1st Qu.:0.0000000  
##  Median :0.0000000            Median :0.006323     Median :0.0000000  
##  Mean   :0.0003592            Mean   :0.006592     Mean   :0.0000608  
##  3rd Qu.:0.0003713            3rd Qu.:0.009171     3rd Qu.:0.0000000  
##  Max.   :0.0723607            Max.   :0.045512     Max.   :0.0171405  
##                                                                       
##       crop           cultivation_abandoned     dugout         
##  Min.   :0.000e+00   Min.   :0.000e+00     Min.   :0.000e+00  
##  1st Qu.:0.000e+00   1st Qu.:0.000e+00     1st Qu.:0.000e+00  
##  Median :0.000e+00   Median :0.000e+00     Median :0.000e+00  
##  Mean   :1.469e-06   Mean   :2.547e-05     Mean   :3.480e-06  
##  3rd Qu.:0.000e+00   3rd Qu.:0.000e+00     3rd Qu.:0.000e+00  
##  Max.   :2.571e-03   Max.   :3.115e-02     Max.   :1.825e-03  
##                                                               
##  facility_other      facility_unknown    fruit_vegetables   golfcourse
##  Min.   :0.0000000   Min.   :0.0000000   Min.   :0        Min.   :0   
##  1st Qu.:0.0000000   1st Qu.:0.0000000   1st Qu.:0        1st Qu.:0   
##  Median :0.0000000   Median :0.0000000   Median :0        Median :0   
##  Mean   :0.0007405   Mean   :0.0001777   Mean   :0        Mean   :0   
##  3rd Qu.:0.0000000   3rd Qu.:0.0000000   3rd Qu.:0        3rd Qu.:0   
##  Max.   :0.2009920   Max.   :0.1379450   Max.   :0        Max.   :0   
##                                                                       
##    greenspace        grvl_sand_pit       harvest_area    
##  Min.   :0.000e+00   Min.   :0.000000   Min.   :0.00000  
##  1st Qu.:0.000e+00   1st Qu.:0.000000   1st Qu.:0.00000  
##  Median :0.000e+00   Median :0.000000   Median :0.00000  
##  Mean   :1.466e-05   Mean   :0.001888   Mean   :0.04720  
##  3rd Qu.:0.000e+00   3rd Qu.:0.000000   3rd Qu.:0.03969  
##  Max.   :3.028e-03   Max.   :0.557858   Max.   :0.83674  
##                                                          
##  harvest_area_white_zone interchange_ramp     lagoon             landfill
##  Min.   :0.0000000       Min.   :0        Min.   :0.0000000   Min.   :0  
##  1st Qu.:0.0000000       1st Qu.:0        1st Qu.:0.0000000   1st Qu.:0  
##  Median :0.0000000       Median :0        Median :0.0000000   Median :0  
##  Mean   :0.0002387       Mean   :0        Mean   :0.0001343   Mean   :0  
##  3rd Qu.:0.0000000       3rd Qu.:0        3rd Qu.:0.0000000   3rd Qu.:0  
##  Max.   :0.0543438       Max.   :0        Max.   :0.0218390   Max.   :0  
##                                                                          
##   lc_class110        lc_class120          lc_class20        lc_class210    
##  Min.   :0.000000   Min.   :0.0000000   Min.   :0.000000   Min.   :0.0000  
##  1st Qu.:0.009894   1st Qu.:0.0000000   1st Qu.:0.000000   1st Qu.:0.2869  
##  Median :0.036593   Median :0.0000000   Median :0.002998   Median :0.5748  
##  Mean   :0.046553   Mean   :0.0008668   Mean   :0.028712   Mean   :0.5346  
##  3rd Qu.:0.061983   3rd Qu.:0.0000000   3rd Qu.:0.032240   3rd Qu.:0.7873  
##  Max.   :0.731887   Max.   :0.1654590   Max.   :0.519648   Max.   :1.0000  
##                                                                            
##   lc_class220        lc_class230        lc_class32          lc_class33      
##  Min.   :0.000000   Min.   :0.00000   Min.   :0.000e+00   Min.   :0.000000  
##  1st Qu.:0.008999   1st Qu.:0.02228   1st Qu.:0.000e+00   1st Qu.:0.000000  
##  Median :0.071814   Median :0.06057   Median :0.000e+00   Median :0.000000  
##  Mean   :0.140469   Mean   :0.12188   Mean   :1.163e-05   Mean   :0.003658  
##  3rd Qu.:0.230000   3rd Qu.:0.17370   3rd Qu.:0.000e+00   3rd Qu.:0.000000  
##  Max.   :1.000000   Max.   :0.93217   Max.   :1.176e-02   Max.   :0.324028  
##                                                                             
##    lc_class34        lc_class50      low_impact_seismic      mill  
##  Min.   :0.00000   Min.   :0.00000   Min.   :0.000000   Min.   :0  
##  1st Qu.:0.00000   1st Qu.:0.01039   1st Qu.:0.000000   1st Qu.:0  
##  Median :0.01762   Median :0.04435   Median :0.000000   Median :0  
##  Mean   :0.03201   Mean   :0.09125   Mean   :0.005522   Mean   :0  
##  3rd Qu.:0.04129   3rd Qu.:0.10789   3rd Qu.:0.004557   3rd Qu.:0  
##  Max.   :0.55710   Max.   :1.00000   Max.   :0.087576   Max.   :0  
##                                                                    
##  mines_oilsands      mines_pitlake misc_oil_gas_facility oil_gas_plant     
##  Min.   :0.0000000   Min.   :0     Min.   :0.0000000     Min.   :0.000000  
##  1st Qu.:0.0000000   1st Qu.:0     1st Qu.:0.0000000     1st Qu.:0.000000  
##  Median :0.0000000   Median :0     Median :0.0000000     Median :0.000000  
##  Mean   :0.0005971   Mean   :0     Mean   :0.0031465     Mean   :0.001106  
##  3rd Qu.:0.0000000   3rd Qu.:0     3rd Qu.:0.0006169     3rd Qu.:0.000000  
##  Max.   :0.1223456   Max.   :0     Max.   :0.3449713     Max.   :0.175037  
##                                                                            
##  open_pit_mine            peat      pipeline         recreation
##  Min.   :0.0000000   Min.   :0   Min.   :0.00000   Min.   :0   
##  1st Qu.:0.0000000   1st Qu.:0   1st Qu.:0.00000   1st Qu.:0   
##  Median :0.0000000   Median :0   Median :0.01158   Median :0   
##  Mean   :0.0005218   Mean   :0   Mean   :0.01810   Mean   :0   
##  3rd Qu.:0.0000000   3rd Qu.:0   3rd Qu.:0.02619   3rd Qu.:0   
##  Max.   :0.0641059   Max.   :0   Max.   :0.28896   Max.   :0   
##                                                                
##    reservoir         residence_clearing  ris_airp_runway ris_borrowpits     
##  Min.   :0.000e+00   Min.   :0.000e+00   Min.   :0       Min.   :0.000e+00  
##  1st Qu.:0.000e+00   1st Qu.:0.000e+00   1st Qu.:0       1st Qu.:0.000e+00  
##  Median :0.000e+00   Median :0.000e+00   Median :0       Median :0.000e+00  
##  Mean   :8.339e-06   Mean   :7.892e-06   Mean   :0       Mean   :1.984e-05  
##  3rd Qu.:0.000e+00   3rd Qu.:0.000e+00   3rd Qu.:0       3rd Qu.:0.000e+00  
##  Max.   :7.894e-03   Max.   :3.113e-03   Max.   :0       Max.   :5.063e-03  
##                                                                             
##  ris_camp_industrial ris_clearing_unknown  ris_drainage      
##  Min.   :0           Min.   :0.0000000    Min.   :0.000e+00  
##  1st Qu.:0           1st Qu.:0.0000000    1st Qu.:0.000e+00  
##  Median :0           Median :0.0000000    Median :0.000e+00  
##  Mean   :0           Mean   :0.0002653    Mean   :5.813e-05  
##  3rd Qu.:0           3rd Qu.:0.0000000    3rd Qu.:0.000e+00  
##  Max.   :0           Max.   :0.0493557    Max.   :1.682e-02  
##                                                              
##  ris_facility_operations ris_facility_unknown ris_mines_oilsands 
##  Min.   :0.0000000       Min.   :0.000e+00    Min.   :0.000e+00  
##  1st Qu.:0.0000000       1st Qu.:0.000e+00    1st Qu.:0.000e+00  
##  Median :0.0000000       Median :0.000e+00    Median :0.000e+00  
##  Mean   :0.0002401       Mean   :3.345e-08    Mean   :5.357e-05  
##  3rd Qu.:0.0000000       3rd Qu.:0.000e+00    3rd Qu.:0.000e+00  
##  Max.   :0.1274343       Max.   :2.780e-05    Max.   :5.667e-02  
##                                                                  
##  ris_oilsands_rms    ris_overburden_dump   ris_plant ris_reclaim_ready
##  Min.   :0.0000000   Min.   :0.000e+00   Min.   :0   Min.   :0        
##  1st Qu.:0.0000000   1st Qu.:0.000e+00   1st Qu.:0   1st Qu.:0        
##  Median :0.0000000   Median :0.000e+00   Median :0   Median :0        
##  Mean   :0.0001467   Mean   :9.603e-05   Mean   :0   Mean   :0        
##  3rd Qu.:0.0000000   3rd Qu.:0.000e+00   3rd Qu.:0   3rd Qu.:0        
##  Max.   :0.0334971   Max.   :2.111e-02   Max.   :0   Max.   :0        
##                                                                       
##  ris_reclaimed_certified ris_reclaimed_permanent ris_reclaimed_temp 
##  Min.   :0               Min.   :0.0000000       Min.   :0.0000000  
##  1st Qu.:0               1st Qu.:0.0000000       1st Qu.:0.0000000  
##  Median :0               Median :0.0000000       Median :0.0000000  
##  Mean   :0               Mean   :0.0004046       Mean   :0.0001344  
##  3rd Qu.:0               3rd Qu.:0.0000000       3rd Qu.:0.0000000  
##  Max.   :0               Max.   :0.0534939       Max.   :0.0476953  
##                                                                     
##     ris_road         ris_soil_replaced   ris_soil_salvaged  
##  Min.   :0.0000000   Min.   :0.0000000   Min.   :0.0000000  
##  1st Qu.:0.0000000   1st Qu.:0.0000000   1st Qu.:0.0000000  
##  Median :0.0000000   Median :0.0000000   Median :0.0000000  
##  Mean   :0.0001202   Mean   :0.0001057   Mean   :0.0000938  
##  3rd Qu.:0.0000000   3rd Qu.:0.0000000   3rd Qu.:0.0000000  
##  Max.   :0.0218055   Max.   :0.0244751   Max.   :0.0414762  
##                                                             
##  ris_tailing_pond    ris_tank_farm ris_transmission_line ris_utilities      
##  Min.   :0.0000000   Min.   :0     Min.   :0.000e+00     Min.   :0.000e+00  
##  1st Qu.:0.0000000   1st Qu.:0     1st Qu.:0.000e+00     1st Qu.:0.000e+00  
##  Median :0.0000000   Median :0     Median :0.000e+00     Median :0.000e+00  
##  Mean   :0.0007656   Mean   :0     Mean   :6.526e-06     Mean   :5.082e-06  
##  3rd Qu.:0.0000000   3rd Qu.:0     3rd Qu.:0.000e+00     3rd Qu.:0.000e+00  
##  Max.   :0.1738171   Max.   :0     Max.   :2.667e-03     Max.   :2.539e-03  
##                                                                             
##    ris_waste  ris_windrow        rlwy_dbl_track rlwy_mlt_track
##  Min.   :0   Min.   :0.000e+00   Min.   :0      Min.   :0     
##  1st Qu.:0   1st Qu.:0.000e+00   1st Qu.:0      1st Qu.:0     
##  Median :0   Median :0.000e+00   Median :0      Median :0     
##  Mean   :0   Mean   :2.231e-05   Mean   :0      Mean   :0     
##  3rd Qu.:0   3rd Qu.:0.000e+00   3rd Qu.:0      3rd Qu.:0     
##  Max.   :0   Max.   :1.595e-02   Max.   :0      Max.   :0     
##                                                               
##  rlwy_sgl_track       rlwy_spur road_gravel_1l     road_gravel_2l     
##  Min.   :0.00e+00   Min.   :0   Min.   :0.000000   Min.   :0.0000000  
##  1st Qu.:0.00e+00   1st Qu.:0   1st Qu.:0.000000   1st Qu.:0.0000000  
##  Median :0.00e+00   Median :0   Median :0.001385   Median :0.0000000  
##  Mean   :2.92e-05   Mean   :0   Mean   :0.002913   Mean   :0.0011075  
##  3rd Qu.:0.00e+00   3rd Qu.:0   3rd Qu.:0.003689   3rd Qu.:0.0004745  
##  Max.   :2.44e-02   Max.   :0   Max.   :0.038085   Max.   :0.0438815  
##                                                                       
##  road_paved_1l road_paved_2l road_paved_3l road_paved_4l road_paved_5l
##  Min.   :0     Min.   :0     Min.   :0     Min.   :0     Min.   :0    
##  1st Qu.:0     1st Qu.:0     1st Qu.:0     1st Qu.:0     1st Qu.:0    
##  Median :0     Median :0     Median :0     Median :0     Median :0    
##  Mean   :0     Mean   :0     Mean   :0     Mean   :0     Mean   :0    
##  3rd Qu.:0     3rd Qu.:0     3rd Qu.:0     3rd Qu.:0     3rd Qu.:0    
##  Max.   :0     Max.   :0     Max.   :0     Max.   :0     Max.   :0    
##                                                                       
##  road_paved_div      road_paved_undiv_1l road_paved_undiv_2l
##  Min.   :0.000e+00   Min.   :0.000e+00   Min.   :0.0000000  
##  1st Qu.:0.000e+00   1st Qu.:0.000e+00   1st Qu.:0.0000000  
##  Median :0.000e+00   Median :0.000e+00   Median :0.0000000  
##  Mean   :4.504e-06   Mean   :7.514e-05   Mean   :0.0005082  
##  3rd Qu.:0.000e+00   3rd Qu.:0.000e+00   3rd Qu.:0.0000000  
##  Max.   :1.936e-03   Max.   :2.147e-02   Max.   :0.0431664  
##                                                             
##  road_unclassified   road_unimproved     road_unpaved_1l road_unpaved_2l
##  Min.   :0.000e+00   Min.   :0.0000000   Min.   :0       Min.   :0      
##  1st Qu.:0.000e+00   1st Qu.:0.0000000   1st Qu.:0       1st Qu.:0      
##  Median :0.000e+00   Median :0.0003318   Median :0       Median :0      
##  Mean   :4.093e-06   Mean   :0.0016662   Mean   :0       Mean   :0      
##  3rd Qu.:0.000e+00   3rd Qu.:0.0018760   3rd Qu.:0       3rd Qu.:0      
##  Max.   :8.613e-04   Max.   :0.0532898   Max.   :0       Max.   :0      
##                                                                         
##   road_winter rough_pasture           runway          rural_residence    
##  Min.   :0    Min.   :0.0000000   Min.   :0.000e+00   Min.   :0.000e+00  
##  1st Qu.:0    1st Qu.:0.0000000   1st Qu.:0.000e+00   1st Qu.:0.000e+00  
##  Median :0    Median :0.0000000   Median :0.000e+00   Median :0.000e+00  
##  Mean   :0    Mean   :0.0002038   Mean   :3.529e-05   Mean   :5.307e-05  
##  3rd Qu.:0    3rd Qu.:0.0000000   3rd Qu.:0.000e+00   3rd Qu.:0.000e+00  
##  Max.   :0    Max.   :0.0828324   Max.   :1.525e-02   Max.   :2.805e-02  
##                                                                          
##       sump          surrounding_veg      tailing_pond        tame_pasture      
##  Min.   :0.000000   Min.   :0.000e+00   Min.   :0.000e+00   Min.   :0.0000000  
##  1st Qu.:0.000000   1st Qu.:0.000e+00   1st Qu.:0.000e+00   1st Qu.:0.0000000  
##  Median :0.000000   Median :0.000e+00   Median :0.000e+00   Median :0.0000000  
##  Mean   :0.002142   Mean   :9.553e-05   Mean   :1.353e-05   Mean   :0.0008195  
##  3rd Qu.:0.001785   3rd Qu.:0.000e+00   3rd Qu.:0.000e+00   3rd Qu.:0.0000000  
##  Max.   :0.311103   Max.   :8.209e-02   Max.   :4.008e-03   Max.   :0.1636895  
##                                                                                
##      trail           transfer_station transmission_line   truck_trail      
##  Min.   :0.0000000   Min.   :0        Min.   :0.000000   Min.   :0.000000  
##  1st Qu.:0.0001209   1st Qu.:0        1st Qu.:0.000000   1st Qu.:0.000000  
##  Median :0.0007039   Median :0        Median :0.000000   Median :0.000000  
##  Mean   :0.0010490   Mean   :0        Mean   :0.004601   Mean   :0.000609  
##  3rd Qu.:0.0015517   3rd Qu.:0        3rd Qu.:0.004977   3rd Qu.:0.000398  
##  Max.   :0.0197691   Max.   :0        Max.   :0.173950   Max.   :0.038651  
##                                                                            
##  urban_industrial   urban_residence     vegetated_edge_railways
##  Min.   :0.000000   Min.   :0.000e+00   Min.   :0.000e+00      
##  1st Qu.:0.000000   1st Qu.:0.000e+00   1st Qu.:0.000e+00      
##  Median :0.000000   Median :0.000e+00   Median :0.000e+00      
##  Mean   :0.001092   Mean   :4.099e-05   Mean   :8.976e-05      
##  3rd Qu.:0.000000   3rd Qu.:0.000e+00   3rd Qu.:0.000e+00      
##  Max.   :0.335749   Max.   :1.157e-02   Max.   :1.271e-01      
##                                                                
##  vegetated_edge_roads   well_aband         well_bitumen     
##  Min.   :0.000000     Min.   :0.0000000   Min.   :0.000000  
##  1st Qu.:0.002604     1st Qu.:0.0003367   1st Qu.:0.000000  
##  Median :0.006764     Median :0.0019160   Median :0.000000  
##  Mean   :0.010682     Mean   :0.0058542   Mean   :0.006039  
##  3rd Qu.:0.013869     3rd Qu.:0.0093228   3rd Qu.:0.005144  
##  Max.   :0.147883     Max.   :0.3045402   Max.   :0.187398  
##                                                             
##    well_cased        well_cleared_not_confirmed well_cleared_not_drilled
##  Min.   :0.0000000   Min.   :0.0000000          Min.   :0.000e+00       
##  1st Qu.:0.0000000   1st Qu.:0.0000000          1st Qu.:0.000e+00       
##  Median :0.0000000   Median :0.0000000          Median :0.000e+00       
##  Mean   :0.0005716   Mean   :0.0002716          Mean   :2.143e-05       
##  3rd Qu.:0.0001940   3rd Qu.:0.0000000          3rd Qu.:0.000e+00       
##  Max.   :0.0685807   Max.   :0.0829690          Max.   :1.469e-02       
##                                                                         
##     well_gas            well_oil   well_other        well_unknown      
##  Min.   :0.0000000   Min.   :0   Min.   :0.000000   Min.   :0.000e+00  
##  1st Qu.:0.0000000   1st Qu.:0   1st Qu.:0.000000   1st Qu.:0.000e+00  
##  Median :0.0000000   Median :0   Median :0.000000   Median :0.000e+00  
##  Mean   :0.0003188   Mean   :0   Mean   :0.001548   Mean   :3.274e-05  
##  3rd Qu.:0.0001151   3rd Qu.:0   3rd Qu.:0.001006   3rd Qu.:0.000e+00  
##  Max.   :0.0572117   Max.   :0   Max.   :0.116479   Max.   :4.813e-03  
## 
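
With this many columns, reading the summary by eye is slow, so a few programmatic checks can help. The sketch below runs on a small made-up tibble (toy_fixed and its values are hypothetical); the same checks could be pointed at covariates_fixed. The range check assumes the covariates are proportions between 0 and 1, which the summary above suggests but which should be confirmed against the metadata.

```r
library(dplyr)

# toy stand-in for the cleaned covariate data (hypothetical values)
toy_fixed <- tibble(
  array     = factor(c('LU01', 'LU01')),
  site      = factor(c('LU01_01', 'LU01_02')),
  buff_dist = factor(c(250, 250)),
  pipeline  = c(0, 0.02),
  trail     = c(0.001, 0)
)

# 1. no NAs remain after the replace() step
no_nas <- !any(is.na(toy_fixed))

# 2. the three id columns are at the front
ids_first <- identical(names(toy_fixed)[1:3],
                       c('array', 'site', 'buff_dist'))

# 3. the remaining columns are in alphabetical order
other_cols <- names(toy_fixed)[-(1:3)]
alphabetical <- identical(other_cols, sort(other_cols))

# 4. numeric columns fall between 0 and 1 (assumes proportions)
num_cols <- toy_fixed[sapply(toy_fixed, is.numeric)]
in_range <- all(sapply(num_cols, function(x) all(x >= 0 & x <= 1)))

# stop with an error if any check fails
stopifnot(no_nas, ids_first, alphabetical, in_range)
```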

Finish covariate data

Save data

Let’s save this merged and cleaned file in case someone wants to use it for their own grouping or exploration (i.e., the kind of work done in the next steps of this script).

Make sure that file names follow the best data management practices for the ACME lab outlined here.

# save data in data processed folder
write_csv(covariates_fixed,
          'data/processed/OSM_covariates_merged_2021_2022.csv')
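
One thing to keep in mind if the saved file is read back in later: a CSV does not store column types, so factor columns such as array, site, and buff_dist will be guessed as character or numeric unless they are declared. The sketch below shows one way to handle that with readr, using a temporary file and a made-up tibble (toy_fixed, tmp, and the values are hypothetical).

```r
library(dplyr)
library(readr)

# toy stand-in for the cleaned covariate data (hypothetical values)
toy_fixed <- tibble(
  array     = factor(c('LU01', 'LU13')),
  site      = factor(c('LU01_01', 'LU13_18')),
  buff_dist = factor(c(250, 500)),
  pipeline  = c(0, 0.02)
)

# write to a temporary file instead of the data/processed folder
tmp <- tempfile(fileext = '.csv')
write_csv(toy_fixed, tmp)

# read back, declaring the id columns as factors; other columns are guessed
toy_reloaded <- read_csv(tmp,
                         col_types = cols(array = col_factor(),
                                          site = col_factor(),
                                          buff_dist = col_factor()))

# confirm the id columns came back as factors
sapply(toy_reloaded[c('array', 'site', 'buff_dist')], is.factor)
```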

Remove messy data

Now that we’ve merged, cleaned, and reformatted the data we don’t need the list file or messy merged data anymore. Let’s remove these from the environment so we don’t accidentally use them.

rm(covariates_merged,
   covariates)

Data formatting

There are too many covariates to include in the models individually and many of them describe similar HFI features.

Now that this section is finalized, we will use the structure outlined in the covariates_table.docx which can be found in the ‘relevant_literature’ folder of this repository for formatting the covariates for this and future related analyses.

The covariates_table and the README file in this repository include descriptions of each feature from the ABMI human footprints wall-to-wall data download website for Year 2021. These descriptions can also be found in the relevant_literature folder of this repository (HFI_2021_v1_0_Metadata_Final.pdf).

Group covariates

As we prepare to lump the covariates together, we may need to reference the column names. Let’s print them now so we have them fresh in the console.

names(covariates_fixed)
##   [1] "array"                        "site"                        
##   [3] "buff_dist"                    "airp_runway"                 
##   [5] "borrowpit_dry"                "borrowpit_wet"               
##   [7] "borrowpits"                   "camp_industrial"             
##   [9] "campground"                   "canal"                       
##  [11] "cfo"                          "clearing_unknown"            
##  [13] "clearing_wellpad_unconfirmed" "conventional_seismic"        
##  [15] "country_residence"            "crop"                        
##  [17] "cultivation_abandoned"        "dugout"                      
##  [19] "facility_other"               "facility_unknown"            
##  [21] "fruit_vegetables"             "golfcourse"                  
##  [23] "greenspace"                   "grvl_sand_pit"               
##  [25] "harvest_area"                 "harvest_area_white_zone"     
##  [27] "interchange_ramp"             "lagoon"                      
##  [29] "landfill"                     "lc_class110"                 
##  [31] "lc_class120"                  "lc_class20"                  
##  [33] "lc_class210"                  "lc_class220"                 
##  [35] "lc_class230"                  "lc_class32"                  
##  [37] "lc_class33"                   "lc_class34"                  
##  [39] "lc_class50"                   "low_impact_seismic"          
##  [41] "mill"                         "mines_oilsands"              
##  [43] "mines_pitlake"                "misc_oil_gas_facility"       
##  [45] "oil_gas_plant"                "open_pit_mine"               
##  [47] "peat"                         "pipeline"                    
##  [49] "recreation"                   "reservoir"                   
##  [51] "residence_clearing"           "ris_airp_runway"             
##  [53] "ris_borrowpits"               "ris_camp_industrial"         
##  [55] "ris_clearing_unknown"         "ris_drainage"                
##  [57] "ris_facility_operations"      "ris_facility_unknown"        
##  [59] "ris_mines_oilsands"           "ris_oilsands_rms"            
##  [61] "ris_overburden_dump"          "ris_plant"                   
##  [63] "ris_reclaim_ready"            "ris_reclaimed_certified"     
##  [65] "ris_reclaimed_permanent"      "ris_reclaimed_temp"          
##  [67] "ris_road"                     "ris_soil_replaced"           
##  [69] "ris_soil_salvaged"            "ris_tailing_pond"            
##  [71] "ris_tank_farm"                "ris_transmission_line"       
##  [73] "ris_utilities"                "ris_waste"                   
##  [75] "ris_windrow"                  "rlwy_dbl_track"              
##  [77] "rlwy_mlt_track"               "rlwy_sgl_track"              
##  [79] "rlwy_spur"                    "road_gravel_1l"              
##  [81] "road_gravel_2l"               "road_paved_1l"               
##  [83] "road_paved_2l"                "road_paved_3l"               
##  [85] "road_paved_4l"                "road_paved_5l"               
##  [87] "road_paved_div"               "road_paved_undiv_1l"         
##  [89] "road_paved_undiv_2l"          "road_unclassified"           
##  [91] "road_unimproved"              "road_unpaved_1l"             
##  [93] "road_unpaved_2l"              "road_winter"                 
##  [95] "rough_pasture"                "runway"                      
##  [97] "rural_residence"              "sump"                        
##  [99] "surrounding_veg"              "tailing_pond"                
## [101] "tame_pasture"                 "trail"                       
## [103] "transfer_station"             "transmission_line"           
## [105] "truck_trail"                  "urban_industrial"            
## [107] "urban_residence"              "vegetated_edge_railways"     
## [109] "vegetated_edge_roads"         "well_aband"                  
## [111] "well_bitumen"                 "well_cased"                  
## [113] "well_cleared_not_confirmed"   "well_cleared_not_drilled"    
## [115] "well_gas"                     "well_oil"                    
## [117] "well_other"                   "well_unknown"

Now we will use the mutate() function with some tidyverse trickery (i.e., nesting across() and contains() in rowSums()) to sum, for each observation (row), all columns whose names contain a given character string. If the variables we want to sum don’t share a common character string, we list each one individually. We can also combine these methods (e.g., with ‘facilities’ [see code]).
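
To make the pattern concrete before the full chunk, here is a minimal sketch on a made-up tibble (the toy object and its values are hypothetical; only the column names are borrowed from the data). It sums every column whose name contains a keyword, adds two columns by name, and drops the originals with .keep = 'unused'.

```r
library(dplyr)

# toy data with a few of the real column names (values are made up)
toy <- tibble(
  site        = c('A_01', 'A_02'),
  trail       = c(0.10, 0.00),
  truck_trail = c(0.05, 0.20),
  canal       = c(0.00, 0.01),
  reservoir   = c(0.02, 0.00)
)

toy_grouped <- toy %>% 
  mutate(
    # keyword method: sum every column whose name contains 'trail'
    trails = rowSums(across(contains('trail'))),
    
    # individual method: no shared keyword, so name each column
    water = canal + 
      reservoir,
    
    # drop the columns that were used to build the new ones
    .keep = 'unused')

names(toy_grouped)
```

Only site and the two new grouped columns remain, which is the same tidying the full chunk relies on.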

covariates_grouped <- covariates_fixed %>% 
  
  # rename 'vegetated_edge_roads' so that we can use 'road' as a keyword to group roads without including this feature
  rename('vegetated_edge_rds' = vegetated_edge_roads) %>% 
  
  # within the mutate function create new column names for the grouped variables
  mutate(
    # borrowpits
    borrowpits = rowSums(across(contains('borrowpit'))) + # here we use rowSums() with across() and contains() to sum across each row the values of all columns that contain the keyword. Be careful that the string (keyword) doesn't also match variables you don't want to include!
      
      dugout +
      lagoon +
      sump,
    
    
    # clearings
    clearings = rowSums(across(contains('clearing'))) +
      runway,
    
    # cultivations
    cultivation = crop + 
      cultivation_abandoned +
      fruit_vegetables +
      rough_pasture +
      tame_pasture,
    
    # harvest areas
    harvest = rowSums(across(contains('harvest'))),
    
    # industrial facilities
    facilities = rowSums(across(contains('facility'))) +
      rowSums(across(contains('plant'))) +
      camp_industrial +
      mill +
      ris_camp_industrial +
      ris_tank_farm +
      ris_utilities +
      urban_industrial,
    
    # mine areas
    mines = rowSums(across(contains('mine'))) +
      rowSums(across(contains('tailing'))) +
      grvl_sand_pit +
      peat +
      ris_drainage +
      ris_oilsands_rms +
      ris_overburden_dump +
      ris_reclaim_ready +
      ris_soil_salvaged +
      ris_waste,
    
    # railways
    railways = rowSums(across(contains('rlwy'))),
    
    # reclaimed areas
    reclaimed = rowSums(across(contains('reclaimed'))) +
      ris_soil_replaced +
      ris_windrow,
    
    # recreation areas
    recreation = campground +
      golfcourse +
      greenspace +
      recreation,
    
    # residential areas (can't use residence as keyword because 'residence_clearing' is in clearing unless we rearrange groupings or rename that one)
    residential = country_residence +
      rural_residence +
      urban_residence,
    
    # roads (we renamed 'vegetated_edge_roads' above to 'vegetated_edge_rds' so we can use 'road' as keyword here which saves a bunch of coding as there are many many road variables)
    roads = rowSums(across(contains('road'))) +
      interchange_ramp +
      airp_runway +
      ris_airp_runway +
      transfer_station,
    
    # seismic lines
    seismic_lines = conventional_seismic,
    
    # 3D seismic lines (put the 3D at the end though to make R happy)
    seismic_lines_3D = low_impact_seismic,
    
    # transmission lines
    transmission_lines = rowSums(across(contains('transmission'))),
    
    # trails
    trails = rowSums(across(contains('trail'))),
    
    # vegetated edges
    veg_edges = rowSums(across(contains('vegetated'))) +
      surrounding_veg,
    
    # man-made water features
    water = canal +
      reservoir,
    
    # well sites (note: contains('well') also matches 'clearing_wellpad_unconfirmed', so that column is counted in both clearings and wells)
    wells = rowSums(across(contains('well'))),
    
    # remove columns that were used to create new columns to tidy the data frame
    .keep = 'unused') %>% 
  
  # reorder alphabetically except array, site and buff_dist
  select(order(colnames(.))) %>% 
  
  # we want to move the columns that aren't HFI features or landcover to the front
  relocate(.,
           c(array,
             site,
             buff_dist)) %>% 
  
  # reorder variables so the veg data is after all the HFI data
  relocate(starts_with('lc_class'),
           .after = wells)

# see what's left
names(covariates_grouped)
##  [1] "array"              "site"               "buff_dist"         
##  [4] "borrowpits"         "cfo"                "clearings"         
##  [7] "cultivation"        "facilities"         "harvest"           
## [10] "landfill"           "mines"              "pipeline"          
## [13] "railways"           "reclaimed"          "recreation"        
## [16] "residential"        "roads"              "seismic_lines"     
## [19] "seismic_lines_3D"   "trails"             "transmission_lines"
## [22] "veg_edges"          "water"              "wells"             
## [25] "lc_class110"        "lc_class120"        "lc_class20"        
## [28] "lc_class210"        "lc_class220"        "lc_class230"       
## [31] "lc_class32"         "lc_class33"         "lc_class34"        
## [34] "lc_class50"
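
Because contains() matches any column whose name includes the keyword, it is worth checking whether one column is being picked up by two groups. The sketch below uses a small made-up tibble (hypothetical values, real column names) to list what the 'well' and 'clearing' keywords match and where they overlap. The same calls could be run on covariates_fixed with any pair of keywords.

```r
library(dplyr)

# toy data with a few of the real column names (values are made up)
toy <- tibble(
  well_gas                     = c(0.01, 0),
  well_oil                     = c(0, 0),
  clearing_unknown             = c(0.02, 0.01),
  clearing_wellpad_unconfirmed = c(0, 0.03)
)

# which columns does each keyword pick up?
well_cols <- toy %>% 
  select(contains('well')) %>% 
  names()

clearing_cols <- toy %>% 
  select(contains('clearing')) %>% 
  names()

# columns matched by both keywords would be summed into both groups
overlap <- intersect(well_cols, clearing_cols)
overlap
```

Any name returned in overlap is counted twice unless the column is renamed first (as was done for vegetated_edge_roads) or the groupings are rearranged.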
# check the structure of new data
str(covariates_grouped)
## tibble [4,660 × 34] (S3: tbl_df/tbl/data.frame)
##  $ array             : Factor w/ 6 levels "LU13","LU15",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ site              : Factor w/ 233 levels "LU13_18","LU13_15",..: 1 2 3 4 5 6 7 8 9 10 ...
##  $ buff_dist         : Factor w/ 20 levels "250","500","750",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ borrowpits        : num [1:4660] 0 0 0 0 0 ...
##  $ cfo               : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ clearings         : num [1:4660] 0.0923 0.0697 0 0 0 ...
##  $ cultivation       : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ facilities        : num [1:4660] 0.291 0 0 0 0 ...
##  $ harvest           : num [1:4660] 0 0 0.687 0.337 0 ...
##  $ landfill          : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ mines             : num [1:4660] 0 0.0873 0 0 0 ...
##  $ pipeline          : num [1:4660] 0 0.068 0 0 0.0301 ...
##  $ railways          : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ reclaimed         : num [1:4660] 0 0.0477 0 0 0 ...
##  $ recreation        : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ residential       : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ roads             : num [1:4660] 0 0.0174 0 0 0 ...
##  $ seismic_lines     : num [1:4660] 0 0.03277 0 0.00889 0.01144 ...
##  $ seismic_lines_3D  : num [1:4660] 0 0 0 0 0.0523 ...
##  $ trails            : num [1:4660] 0.00588 0.0028 0 0.01591 0 ...
##  $ transmission_lines: num [1:4660] 0.0642 0 0 0 0.091 ...
##  $ veg_edges         : num [1:4660] 0 0.0858 0 0 0 ...
##  $ water             : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ wells             : num [1:4660] 0 0 0 0 0.0322 ...
##  $ lc_class110       : num [1:4660] 0.193 0.348 0 0 0.178 ...
##  $ lc_class120       : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ lc_class20        : num [1:4660] 0.0361 0 0 0 0 ...
##  $ lc_class210       : num [1:4660] 0.456 0.358 0.186 1 0.822 ...
##  $ lc_class220       : num [1:4660] 0 0 0 0 0 ...
##  $ lc_class230       : num [1:4660] 0 0.101 0.255 0 0 ...
##  $ lc_class32        : num [1:4660] 0 0 0 0 0 0 0 0 0 0 ...
##  $ lc_class33        : num [1:4660] 0 0.101 0 0 0 ...
##  $ lc_class34        : num [1:4660] 0 0.0916 0 0 0 ...
##  $ lc_class50        : num [1:4660] 0.316 0 0.559 0 0 ...
# check summary of new data
summary(covariates_grouped)
##   array          site        buff_dist      borrowpits      
##  LU13:820   LU13_18:  20   250    : 233   Min.   :0.000000  
##  LU15:780   LU13_15:  20   500    : 233   1st Qu.:0.000000  
##  LU21:720   LU13_03:  20   750    : 233   Median :0.001334  
##  LU01:780   LU13_34:  20   1000   : 233   Mean   :0.004175  
##  LU2 :840   LU13_57:  20   1250   : 233   3rd Qu.:0.004419  
##  LU3 :720   LU13_16:  20   1500   : 233   Max.   :0.311103  
##             (Other):4540   (Other):3262                     
##       cfo              clearings          cultivation        facilities      
##  Min.   :0.000e+00   Min.   :0.0000000   Min.   :0.00000   Min.   :0.000000  
##  1st Qu.:0.000e+00   1st Qu.:0.0000000   1st Qu.:0.00000   1st Qu.:0.000000  
##  Median :0.000e+00   Median :0.0004464   Median :0.00000   Median :0.000000  
##  Mean   :5.398e-07   Mean   :0.0051266   Mean   :0.00105   Mean   :0.007104  
##  3rd Qu.:0.000e+00   3rd Qu.:0.0036890   3rd Qu.:0.00000   3rd Qu.:0.003121  
##  Max.   :1.217e-03   Max.   :0.4023522   Max.   :0.18015   Max.   :0.466010  
##                                                                              
##     harvest           landfill     mines             pipeline      
##  Min.   :0.00000   Min.   :0   Min.   :0.000000   Min.   :0.00000  
##  1st Qu.:0.00000   1st Qu.:0   1st Qu.:0.000000   1st Qu.:0.00000  
##  Median :0.00000   Median :0   Median :0.000000   Median :0.01158  
##  Mean   :0.04744   Mean   :0   Mean   :0.004234   Mean   :0.01810  
##  3rd Qu.:0.04085   3rd Qu.:0   3rd Qu.:0.000000   3rd Qu.:0.02619  
##  Max.   :0.83674   Max.   :0   Max.   :0.557858   Max.   :0.28896  
##                                                                    
##     railways          reclaimed          recreation         residential       
##  Min.   :0.00e+00   Min.   :0.000000   Min.   :0.0000000   Min.   :0.0000000  
##  1st Qu.:0.00e+00   1st Qu.:0.000000   1st Qu.:0.0000000   1st Qu.:0.0000000  
##  Median :0.00e+00   Median :0.000000   Median :0.0000000   Median :0.0000000  
##  Mean   :2.92e-05   Mean   :0.000667   Mean   :0.0001176   Mean   :0.0001549  
##  3rd Qu.:0.00e+00   3rd Qu.:0.000000   3rd Qu.:0.0000000   3rd Qu.:0.0000000  
##  Max.   :2.44e-02   Max.   :0.078325   Max.   :0.0289661   Max.   :0.0280500  
##                                                                               
##      roads          seismic_lines      seismic_lines_3D       trails         
##  Min.   :0.000000   Min.   :0.000000   Min.   :0.000000   Min.   :0.0000000  
##  1st Qu.:0.002264   1st Qu.:0.003485   1st Qu.:0.000000   1st Qu.:0.0002191  
##  Median :0.004540   Median :0.006323   Median :0.000000   Median :0.0011395  
##  Mean   :0.006399   Mean   :0.006592   Mean   :0.005522   Mean   :0.0016580  
##  3rd Qu.:0.008663   3rd Qu.:0.009171   3rd Qu.:0.004557   3rd Qu.:0.0022439  
##  Max.   :0.071812   Max.   :0.045512   Max.   :0.087576   Max.   :0.0386512  
##                                                                              
##  transmission_lines   veg_edges            water               wells          
##  Min.   :0.000000   Min.   :0.000000   Min.   :0.000e+00   Min.   :0.0000000  
##  1st Qu.:0.000000   1st Qu.:0.002612   1st Qu.:0.000e+00   1st Qu.:0.0007196  
##  Median :0.000000   Median :0.006847   Median :0.000e+00   Median :0.0061983  
##  Mean   :0.004607   Mean   :0.010867   Mean   :2.504e-05   Mean   :0.0150163  
##  3rd Qu.:0.004977   3rd Qu.:0.014110   3rd Qu.:0.000e+00   3rd Qu.:0.0177990  
##  Max.   :0.173950   Max.   :0.156249   Max.   :1.961e-02   Max.   :0.3045402  
##                                                                               
##   lc_class110        lc_class120          lc_class20        lc_class210    
##  Min.   :0.000000   Min.   :0.0000000   Min.   :0.000000   Min.   :0.0000  
##  1st Qu.:0.009894   1st Qu.:0.0000000   1st Qu.:0.000000   1st Qu.:0.2869  
##  Median :0.036593   Median :0.0000000   Median :0.002998   Median :0.5748  
##  Mean   :0.046553   Mean   :0.0008668   Mean   :0.028712   Mean   :0.5346  
##  3rd Qu.:0.061983   3rd Qu.:0.0000000   3rd Qu.:0.032240   3rd Qu.:0.7873  
##  Max.   :0.731887   Max.   :0.1654590   Max.   :0.519648   Max.   :1.0000  
##                                                                            
##   lc_class220        lc_class230        lc_class32          lc_class33      
##  Min.   :0.000000   Min.   :0.00000   Min.   :0.000e+00   Min.   :0.000000  
##  1st Qu.:0.008999   1st Qu.:0.02228   1st Qu.:0.000e+00   1st Qu.:0.000000  
##  Median :0.071814   Median :0.06057   Median :0.000e+00   Median :0.000000  
##  Mean   :0.140469   Mean   :0.12188   Mean   :1.163e-05   Mean   :0.003658  
##  3rd Qu.:0.230000   3rd Qu.:0.17370   3rd Qu.:0.000e+00   3rd Qu.:0.000000  
##  Max.   :1.000000   Max.   :0.93217   Max.   :1.176e-02   Max.   :0.324028  
##                                                                             
##    lc_class34        lc_class50     
##  Min.   :0.00000   Min.   :0.00000  
##  1st Qu.:0.00000   1st Qu.:0.01039  
##  Median :0.01762   Median :0.04435  
##  Mean   :0.03201   Mean   :0.09125  
##  3rd Qu.:0.04129   3rd Qu.:0.10789  
##  Max.   :0.55710   Max.   :1.00000  
## 
# there are some NAs in the data, which will cause problems with modeling and visualization; remove them for now and explore the affected sites specifically after the report

covariates_grouped <- covariates_grouped %>% 
  
  # remove rows with NAs
  na.omit()
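
Before dropping incomplete rows, it can be useful to set them aside so the affected sites are easy to revisit later. The sketch below is only an illustration on made-up data: `toy_covariates`, `rows_with_na`, and `toy_complete` are hypothetical names, not objects from this script.

```r
library(dplyr)

# toy data standing in for covariates_grouped (all values are made up)
toy_covariates <- tibble(
  site      = c('A_01', 'A_02', 'A_03', 'A_04'),
  buff_dist = c(250, 250, 500, 500),
  roads     = c(0.01, NA, 0.03, 0.02),
  wells     = c(0.00, 0.02, NA, 0.01)
)

# keep a copy of the rows with at least one NA for later follow-up
rows_with_na <- toy_covariates %>% 
  filter(if_any(everything(), is.na))

# complete rows only, the same result na.omit() gives in the chunk above
toy_complete <- toy_covariates %>% 
  na.omit()

# which sites were affected
rows_with_na$site
```

Saving `rows_with_na` (or just its `site` column) before calling `na.omit()` would give a ready-made list of sites to check after the report.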

Grouped histograms

Let’s look at the histograms again and see whether we need to remove any features or feature groups that don’t have enough data.

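# use a for loop to plot histograms for all covariates
# [[col]] extracts the column as a vector; hist() needs a numeric vector, not a one-column data frame

for (col in 5:ncol(covariates_grouped)) {
    hist(covariates_grouped[[col]],
         main = names(covariates_grouped)[col],
         xlab = names(covariates_grouped)[col])
}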

> IMO we don’t have enough variation in the data to use the following features/feature groups:

  • cfo
  • clearings
  • cultivation
  • landfill (all values are 0)
  • railways
  • reclaimed
  • recreation
  • residential
  • water
  • lc_class20 (aka water)
  • lc_class120 (aka agriculture)
  • lc_class32 (aka rocks and rubble)
  • lc_class33 (aka exposed land)
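
The histograms are judged by eye, so a numeric screen can help back up the call. One possible sketch, on made-up data, is to compute the proportion of zero values per covariate; `toy_covariates` and `zero_summary` are hypothetical names, and how high a proportion counts as "not enough variation" is left as a judgment call.

```r
library(dplyr)
library(tidyr)

# toy data standing in for covariates_grouped (all values are made up)
toy_covariates <- tibble(
  site     = c('A_01', 'A_02', 'A_03', 'A_04'),
  cfo      = c(0, 0, 0, 0.001),
  roads    = c(0.002, 0.005, 0.009, 0.004),
  landfill = c(0, 0, 0, 0)
)

zero_summary <- toy_covariates %>% 
  
  # proportion of rows equal to zero for every numeric column
  summarise(across(where(is.numeric), ~ mean(.x == 0))) %>% 
  
  # reshape to one row per covariate so it is easy to sort and read
  pivot_longer(everything(),
               names_to = 'covariate',
               values_to = 'prop_zero') %>% 
  
  # covariates with the most zeros first
  arrange(desc(prop_zero))

zero_summary
```

Covariates near the top of this table are the ones most likely to be too sparse to model on their own.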

Also, there’s not a lot of data for the following features, which are similar and of interest to OSM. In the past they’ve been grouped together, and we will do the same here:

  • borrowpits
  • facilities
  • mines

For this analysis we will also fold clearings into this group, so the combined osm_industrial variable is the sum of borrowpits, clearings, facilities, and mines.

Group covariates further

So let’s modify this data and remove those features for now. This step will likely need to change each year.

Let’s also rename the landcover classes so they make more sense without having to look them up by number (this should probably be added earlier in the script for next year).

covariates_grouped <- covariates_grouped %>% 
  
  # create column osm_industrial
  mutate(
    osm_industrial = borrowpits +
    clearings +
    facilities +
    mines,
    
    # remove columns we used to make this variable
    .keep = 'unused') %>% 
  
  # remove other features we don't need
  select(!c(cfo,
            cultivation,
            reclaimed,
            recreation,
            residential,
            water,
            lc_class20,
            lc_class120,
            lc_class32,
            lc_class33,
            landfill,
            railways)) %>%
  
  # rename landcover classes
  rename(
    grassland = lc_class110,
    coniferous = lc_class210,
    broadleaf = lc_class220,
    mixed = lc_class230,
    developed = lc_class34,
    shrub = lc_class50) 

# check that it worked
names(covariates_grouped)
##  [1] "array"              "site"               "buff_dist"         
##  [4] "harvest"            "pipeline"           "roads"             
##  [7] "seismic_lines"      "seismic_lines_3D"   "trails"            
## [10] "transmission_lines" "veg_edges"          "wells"             
## [13] "grassland"          "coniferous"         "broadleaf"         
## [16] "mixed"              "developed"          "shrub"             
## [19] "osm_industrial"
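
The `.keep = 'unused'` argument is what drops the component columns in the chunk above. A minimal sketch on made-up data shows the behaviour; `toy` and `toy_grouped` are hypothetical names used only for illustration.

```r
library(dplyr)

# toy data with two columns to combine and one to leave alone (values are made up)
toy <- tibble(
  site       = c('A_01', 'A_02'),
  borrowpits = c(0.01, 0.00),
  mines      = c(0.02, 0.03),
  roads      = c(0.05, 0.04)
)

toy_grouped <- toy %>% 
  
  # sum the two features; .keep = 'unused' drops the columns used in the sum
  mutate(osm_industrial = borrowpits + mines,
         .keep = 'unused')

# borrowpits and mines are gone; site, roads, and the new column remain
names(toy_grouped)
```

This is why borrowpits, clearings, facilities, and mines no longer appear in the column names printed above, even though they are not listed in the `select()` call.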

Save grouped data

Let’s save this data now that it’s all formatted and grouped.

write_csv(covariates_grouped,
          'data/processed/OSM_covariates_grouped_2021_2022.csv')

Remove messy data

Let’s remove the data frames we no longer need.

rm(covariates_fixed)

Data exploration

Subset data by buffer

We need to subset the data so we have a separate data frame for each buffer width to work with in the analysis AND to explore correlation between variables at each buffer width, as these may vary with spatial scale.

Let’s use a for loop to subset the data.

buffer_frames <- list()

for (i in unique(covariates_grouped$buff_dist)){
  
  print(i)
  
  # Subset data based on radius
  df <- covariates_grouped %>%
    filter(buff_dist == i)
  
  # append this buffer's data frame to the list
  buffer_frames <- c(buffer_frames, list(df))
}
## [1] "250"
## [1] "500"
## [1] "750"
## [1] "1000"
## [1] "1250"
## [1] "1500"
## [1] "1750"
## [1] "2000"
## [1] "2250"
## [1] "2500"
## [1] "2750"
## [1] "3000"
## [1] "3250"
## [1] "3500"
## [1] "3750"
## [1] "4000"
## [1] "4250"
## [1] "4500"
## [1] "4750"
## [1] "5000"
# name list objects so we can extract names for plotting 

buffer_frames <- buffer_frames %>% 
  
  # naming each element by hand is long-winded, but it is quick to write and explicit
  purrr::set_names('250 meter buffer',
                   '500 meter buffer',
                   '750 meter buffer',
                   '1000 meter buffer',
                   '1250 meter buffer',
                   '1500 meter buffer',
                   '1750 meter buffer',
                   '2000 meter buffer',
                   '2250 meter buffer',
                   '2500 meter buffer',
                   '2750 meter buffer',
                   '3000 meter buffer',
                   '3250 meter buffer',
                   '3500 meter buffer',
                   '3750 meter buffer',
                   '4000 meter buffer',
                   '4250 meter buffer',
                   '4500 meter buffer',
                   '4750 meter buffer',
                   '5000 meter buffer')

Now we have a list with data frames for each buffer width which we can work with later.

We will need to repeat this step in the analysis script
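
Since this step has to be repeated in the analysis script, a shorter version may be worth considering. The sketch below is one possible alternative, not the method used above: base R’s `split()` builds the list in one call and the names are generated with `paste()`. It uses made-up data, and `toy` and `toy_frames` are hypothetical names.

```r
library(dplyr)

# toy data with three buffer widths (values are made up)
toy <- tibble(
  site      = rep(c('A_01', 'A_02'), times = 3),
  buff_dist = factor(rep(c(250, 500, 750), each = 2)),
  roads     = c(0.01, 0.02, 0.02, 0.03, 0.03, 0.04)
)

# one data frame per buffer width, named by factor level
toy_frames <- split(toy, toy$buff_dist)

# turn '250' into '250 meter buffer', and so on
names(toy_frames) <- paste(names(toy_frames), 'meter buffer')

names(toy_frames)
```

Note that `split()` orders the list by factor level, whereas the for loop follows the order in which values first appear in the data, so the two approaches only give the same ordering when the levels are already in ascending order.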

Autocorrelation

Correlation plots

Now we need to make correlation plots for each buffer width to see which variables are correlated at a given spatial scale. We can use purrr::map() with the chart.Correlation() function from the PerformanceAnalytics package to make correlation plots with a specified method (e.g., pearson, spearman). These plots also show histograms and scatterplots of each variable.

correlation_plots <- buffer_frames %>% 
  
  purrr::map(
    ~.x %>% 
      
      # select numeric variables only since we can't compute a correlation coefficient for non-numeric variables
      select_if(is.numeric) %>% 
      
      # use chart.correlation 
      chart.Correlation(.,
                        histogram = TRUE, 
                        method = "pearson")
  )

There is a section for each buffer width listing the pairs of variables that are correlated and thus should not be included in the same model, along with each pair’s correlation coefficient (r).
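
Reading the pairs off each plot by hand is slow and easy to get wrong, so the lists could also be generated from the correlation matrix. The sketch below uses made-up data and an assumed cutoff of |r| >= 0.5, which is an illustration only and not a threshold stated in this script; `toy`, `cor_matrix`, and `correlated_pairs` are hypothetical names.

```r
library(dplyr)
library(tidyr)
library(tibble)

# toy data standing in for the numeric columns of one buffer (values are made up)
toy <- tibble(
  roads     = c(1, 2, 3, 4, 5),
  veg_edges = c(2, 4, 6, 8, 11),
  wells     = c(5, 1, 4, 2, 3)
)

# pairwise Pearson correlations
cor_matrix <- cor(toy, method = 'pearson')

# blank the diagonal and lower triangle so each pair appears only once
cor_matrix[lower.tri(cor_matrix, diag = TRUE)] <- NA

correlated_pairs <- cor_matrix %>% 
  as.data.frame() %>% 
  rownames_to_column('var1') %>% 
  
  # one row per pair of variables
  pivot_longer(-var1, names_to = 'var2', values_to = 'r') %>% 
  
  # keep pairs at or above the assumed cutoff of 0.5
  filter(!is.na(r), abs(r) >= 0.5) %>% 
  
  # strongest correlations first
  arrange(desc(abs(r)))

correlated_pairs
```

Wrapping this in `purrr::map()` over the list of buffer data frames would produce one table per buffer width.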

250m

buffer_frames$`250 meter buffer` %>% 
  
  select_if(is.numeric) %>% 
  
   # use chart.correlation 
      chart.Correlation(.,
                        histogram = TRUE, 
                        method = "pearson")

mtext('250 meter buffer', side = 3, line = 3)

  • pipeline & transmission_lines 0.53
  • roads & veg_edges 0.71
  • roads & developed 0.57

500m

buffer_frames$`500 meter buffer` %>% 
  
  select_if(is.numeric) %>% 
  
   # use chart.correlation 
      chart.Correlation(.,
                        histogram = TRUE, 
                        method = "pearson")

mtext('500 meter buffer', side = 3, line = 3)

  • harvest & mixed
  • pipeline & transmission_lines 0.64
  • pipeline & roads 0.59
  • pipeline & veg_edges 0.57
  • pipeline & wells 0.75
  • pipeline & grassland 0.62
  • roads & veg_edges 0.79
  • roads & wells 0.75
  • roads & developed 0.75
  • roads & osm_industrial 0.72
  • veg_edges & wells 0.61
  • veg_edges & developed 0.88
  • wells & grassland 0.61
  • wells & developed 0.67
  • coniferous & broadleaf -0.77

750m

Exploratory plots

Add more to this section later, when we have more time to explore the covariates and choose which should be included.

# use this code to shrink the figure margins; otherwise the histograms will not plot because the figure margins are too large
par(mar=c(1,1,1,1))

# now use purrr to plot histograms for all remaining HFI variables for each buffer
hfi_histograms <- buffer_frames %>% 
  
  purrr::imap(
    ~.x %>% 
      
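      # filter to just the HFI variables; the landcover columns were renamed above,
      # so exclude them by their new names rather than the old 'lc_class' prefix
      select(where(is.numeric) &
          !any_of(c('grassland', 'coniferous', 'broadleaf',
                    'mixed', 'developed', 'shrub'))) %>% 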
      
      # pipe into hist.data.frame function to make histograms for each variable
      hist.data.frame(mtitl = paste0('Histograms of HFI variables at ', .y)))

Now let’s do the same thing with the landcover variables

lc_histograms <- buffer_frames %>% 
  
  purrr::imap(
    ~.x %>% 
      
      # filter to just the landcover variables, using their renamed column names
      select(where(is.numeric) &
          any_of(c('grassland', 'coniferous', 'broadleaf',
                   'mixed', 'developed', 'shrub'))) %>% 
      
      # pipe into hist.data.frame function to make histograms for each variable
      hist.data.frame(mtitl = paste0('Histograms of landcover variables at ', .y)))